Reputation: 359
I am trying to gather tweets from a specific location and then analyse what is going on there from the tweets gathered. The task involves a lot of data mining.
The main problem I have come across, however, is gathering enough tweets to make a judgement.
I have been using the Twitter Streaming API, but it only gives 1% of all tweets, which is far from enough. I mined 100,000 tweets and very few were in English, let alone related to the location I was looking for.
I have also noticed that Twitter rate-limits how often you can call a method via their API. How do sites like trendsmap.com work? Are they somehow accessing a larger data set?
Edit: OK, so I have tried the geolocation feature in the Twitter4J API. It turns out the rate limits can be avoided if you are careful with your implementation. However, the number of people who actually have geolocation turned on when tweeting is very low, so the results do not represent the people in that area, and I seem to get the same tweets every single time. Twitter does offer a "near" search operator, which works great on their website, but as far as I can tell they have not included it in their API.
Upvotes: 2
Views: 648
Reputation: 14324
If you are searching using the Twitter API, you can restrict your searches to a specific geolocation using the geocode option.
You can use result_type=recent to ensure you're only getting the most recent tweets.
The maximum count (that is, the number of tweets per request) is 100.
The current limit on the number of search requests per hour is 450.
So, that's a maximum of 45,000 tweets per hour - is that enough for you?
tl;dr - use the most restrictive set of search parameters to limit the results to those you actually need.
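As a rough illustration of the parameters above, here is a minimal sketch in Python (standard library only, not Twitter4J) that builds a search URL and works out the hourly ceiling. The endpoint URL, the "lat,long,radius" format for geocode, the lang filter and the helper names are assumptions made for the example; check them against the current API documentation. The sketch only builds the URL: it does not authenticate or send the request.

```python
from urllib.parse import urlencode

# Assumed search endpoint; verify against the current Twitter API docs.
SEARCH_URL = "https://api.twitter.com/1.1/search/tweets.json"

MAX_COUNT = 100          # tweets per request, as stated above
REQUESTS_PER_HOUR = 450  # search requests per hour, as stated above


def build_search_url(lat, lon, radius, query, lang=None):
    """Build a search URL restricted to a circle around (lat, lon).

    The geocode value is assumed to be "latitude,longitude,radius",
    with the radius carrying its unit, e.g. "10km" or "5mi".
    """
    params = {
        "q": query,
        "geocode": f"{lat},{lon},{radius}",
        "result_type": "recent",  # most recent tweets only
        "count": MAX_COUNT,       # ask for the maximum per request
    }
    if lang:
        # Assumed language filter (e.g. "en") to cut non-English results.
        params["lang"] = lang
    return SEARCH_URL + "?" + urlencode(params)


def max_tweets_per_hour():
    """Upper bound on tweets per hour: per-request count times requests."""
    return MAX_COUNT * REQUESTS_PER_HOUR


if __name__ == "__main__":
    # Hypothetical example: tweets about "traffic" within 10 km of central London.
    print(build_search_url(51.5074, -0.1278, "10km", "traffic", lang="en"))
    print(max_tweets_per_hour())
```

The figure from max_tweets_per_hour() is a ceiling, not a forecast: a tightly restricted geocode search will usually return far fewer than 100 tweets per request.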
Upvotes: 2