Reputation: 5768
I'm trying to gather some information on the best possible way to collect tweets and store them in a database. I've been looking at the Twitter Streaming API and at an interface called Phirehose, which seems to offer an easy-to-set-up way of tapping into this stream and collecting data.
I'm just wondering if this is the only way, or if someone might recommend a better method of doing this?
I apologize for how broad the question is, but I'm just trying to gain some information that might point me in the right direction.
Upvotes: 1
Views: 1235
Reputation: 28968
Phirehose is designed for the use case you describe: it takes care of the connection, including things like backing off when reconnect attempts fail.
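To give a feel for what "back-off" means here, below is a minimal sketch of an exponential back-off schedule. It is not Phirehose's actual code; the function name `backoff_delays` and the `base`, `cap`, and `attempts` parameters are made up for illustration.

```python
def backoff_delays(base=1.0, cap=240.0, attempts=6):
    """Yield the wait in seconds before each reconnect attempt.

    The wait doubles after every failed attempt and never exceeds `cap`.
    All names and default values here are illustrative assumptions.
    """
    delay = base
    for _ in range(attempts):
        yield min(cap, delay)
        delay *= 2


if __name__ == "__main__":
    # Show the schedule a client would sleep through between reconnects.
    for attempt, wait in enumerate(backoff_delays(base=1, cap=16), start=1):
        print(f"attempt {attempt}: wait {wait}s before reconnecting")
```

A library that handles this for you saves you from hammering the endpoint with reconnects when it is already struggling.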
You mentioned you are only interested in a certain geographical area. Use Phirehose's setLocations() to do that; see filter-track-geo.php in the Phirehose examples directory for how. (But note that you will miss tweets by users who live next door to you but choose not to include their location in their tweets.)
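The caveat about missing location data can be illustrated with a small sketch of a bounding-box test. This is a simplification, not how Twitter or Phirehose filters internally. It assumes each tweet is a dict whose `coordinates` field, when present, is a GeoJSON point with `[longitude, latitude]`; the helper name `in_bounding_box` and the example box are hypothetical.

```python
def in_bounding_box(tweet, box):
    """Return True if the tweet carries a point inside `box`.

    `box` is (west_lon, south_lat, east_lon, north_lat).
    Assumes tweet["coordinates"] is either None or a GeoJSON point:
    {"type": "Point", "coordinates": [lon, lat]}.
    """
    point = (tweet.get("coordinates") or {}).get("coordinates")
    if not point:
        # No location attached: a geographic filter cannot match this tweet.
        return False
    lon, lat = point
    west, south, east, north = box
    return west <= lon <= east and south <= lat <= north


if __name__ == "__main__":
    london = (-0.51, 51.28, 0.33, 51.69)  # rough example box, an assumption
    tagged = {"coordinates": {"type": "Point", "coordinates": [-0.12, 51.50]}}
    untagged = {"coordinates": None}
    print(in_bounding_box(tagged, london))
    print(in_bounding_box(untagged, london))
```

The second tweet is dropped even if its author is physically inside the box, which is exactly the gap described above.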
The alternative is to skip the streaming API and poll the standard REST API instead. As far as I know, that gives you nothing the streaming API does not, and it adds latency and overhead.
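Whichever way the tweets arrive, the storage side can look much the same. Here is a minimal sketch using SQLite, keyed on the tweet ID so that a tweet received twice (for example after a reconnect, or from overlapping polls) is stored only once. The table layout and the function names `open_store` and `save_tweet` are assumptions for illustration, and it presumes each tweet is a dict with an `id` field.

```python
import json
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a SQLite database with a simple tweets table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tweets ("
        " id INTEGER PRIMARY KEY,"   # tweet ID doubles as the dedup key
        " created_at TEXT,"
        " raw TEXT NOT NULL)"        # full JSON, so nothing is lost
    )
    return conn


def save_tweet(conn, tweet):
    """Insert a tweet; return True if it was new, False if already stored."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO tweets (id, created_at, raw) VALUES (?, ?, ?)",
        (tweet["id"], tweet.get("created_at"), json.dumps(tweet)),
    )
    conn.commit()
    return cur.rowcount == 1


if __name__ == "__main__":
    conn = open_store()
    print(save_tweet(conn, {"id": 1, "text": "hello"}))
    print(save_tweet(conn, {"id": 1, "text": "hello"}))
```

Keeping the raw JSON alongside a few indexed columns means you can add columns later without re-collecting anything.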
Upvotes: 2
Reputation: 2467
The Firehose API would return ALL public tweets, which is probably too much for most applications to handle (and probably also not accessible for free). Instead, you could use the Sample API, which delivers 3000 sample tweets every minute. See here.
This (or any other Twitter API) is made available as a REST API. You can either write your own code that reads the API or use one of the many libraries that are already out there. For a listing of libraries see here.
Regards, Daniel
Upvotes: 1