Reputation: 1992
I am using Twython, a Python wrapper for the Twitter API, to extract tweets, but I only get 100 tweets back. I want to extract tweets from 10 December 2013 to 10 March 2014, and I have passed count=1000 to the search function.
I understand that the limit is 100 tweets per request. Is there a way to get all the tweets from that period without running into this limit?
from twython import Twython
import csv
from dateutil import parser
from dateutil.parser import parse as parse_date
# "import datetime" dropped: the next line rebinds the same name to the class
from datetime import datetime
import pytz
utc=pytz.UTC
APP_KEY = 'xxxxxxxxxxx'
APP_SECRET = 'xxxxxxxxxxx'
OAUTH_TOKEN = 'xxxxxxxx' # Access Token here
OAUTH_TOKEN_SECRET = 'xxxxxxxxxxx'
t = Twython(app_key=APP_KEY, app_secret=APP_SECRET, oauth_token=OAUTH_TOKEN, oauth_token_secret=OAUTH_TOKEN_SECRET)
search=t.search(q='AAPL', count="1000",since='2013-12-10')
tweets= search['statuses']
for tweet in tweets:
    pass  # do something with each tweet
Upvotes: 0
Views: 2489
Reputation:
With Twython, the Search API is limited, but I have had success using get_user_timeline instead.
I solved a similar problem where I wanted to grab the last X tweets from a user.
As the documentation describes, the trick that worked for me is to keep track of the id of the last tweet I read and pass it as max_id on the next request, so that each request picks up where the previous one stopped.
For your case, you would just need to modify the while loop to stop on some condition on 'created_at'. Something like this could work:
# Grab the first 200 tweets
last_id = 0
full_timeline = 200
result = t.get_user_timeline(screen_name='NAME', count=full_timeline)
for tweet in result:
    print(tweet['text'], tweet['created_at'])
    last_id = tweet['id']
# Update full_timeline to see how many tweets were actually received
# full_timeline will be less than 200 if we have read all of the user's tweets
full_timeline = len(result)
# 199 because result[1:] is used to trim the duplicate caused by max_id
while full_timeline >= 199:
    result = t.get_user_timeline(screen_name='NAME', count=200, max_id=last_id)
    # max_id is inclusive, so the response repeats the tweet we read last; trim it
    result = result[1:]
    for tweet in result:
        print(tweet['text'], tweet['created_at'])
        last_id = tweet['id']
    # Update full_timeline to keep the loop going if there are leftover tweets
    full_timeline = len(result)
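To make the 'created_at' stop condition concrete, here is a minimal sketch of the same max_id paging with a date window. The names collect_between and fetch_page are my own, not part of Twython, and I am assuming 'created_at' arrives in the usual form such as "Tue Dec 10 10:00:00 +0000 2013" and that each page comes back newest first. It also uses max_id = last id - 1 instead of trimming the duplicate.

```python
from datetime import datetime, timezone

# Assumed format of the 'created_at' field, e.g. "Tue Dec 10 10:00:00 +0000 2013"
TWITTER_TIME_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def collect_between(fetch_page, start, end, page_size=200):
    """Collect tweets with start <= created_at <= end.

    fetch_page is a hypothetical callable that accepts count= and,
    optionally, max_id= and returns a list of tweet dicts, newest first.
    """
    collected = []
    max_id = None
    while True:
        params = {"count": page_size}
        if max_id is not None:
            params["max_id"] = max_id
        page = fetch_page(**params)
        if not page:
            break  # nothing older is available
        for tweet in page:
            created = datetime.strptime(tweet["created_at"], TWITTER_TIME_FORMAT)
            if created < start:
                return collected  # everything after this is older still
            if created <= end:
                collected.append(tweet)
        # Ask for strictly older tweets next time, which avoids the duplicate
        max_id = page[-1]["id"] - 1
    return collected


# Possible usage with the Twython client from the question (not run here):
#   from functools import partial
#   fetch = partial(t.get_user_timeline, screen_name='NAME')
#   start = datetime(2013, 12, 10, tzinfo=timezone.utc)
#   end = datetime(2014, 3, 10, 23, 59, 59, tzinfo=timezone.utc)
#   tweets = collect_between(fetch, start, end)
```

Passing the fetch function in as an argument keeps the paging logic separate from the API call, so it can be tried against canned data before spending real requests.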
Upvotes: 1
Reputation: 3881
There's a limitation when accessing tweets through the Search API. Have a look at the documentation.
The Search API usually only serves tweets from the past week.
As you're trying to retrieve tweets from the past three to four months, you are not getting the older tweets.
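Within that one-week window you can still get more than 100 results by paging, since count is capped at 100 per request. A rough sketch, assuming the search response is a dict with a 'statuses' list as in the question's code; search_all and max_pages are names I made up for illustration:

```python
def search_all(search, query, page_size=100, max_pages=10, **extra):
    """Page through search results by walking max_id downwards.

    search is a callable such as Twython's t.search; max_pages is an
    illustrative cap so the loop cannot burn through the rate limit.
    """
    statuses = []
    max_id = None
    for _ in range(max_pages):
        params = dict(q=query, count=page_size, **extra)
        if max_id is not None:
            params["max_id"] = max_id
        page = search(**params)["statuses"]
        if not page:
            break  # the index has nothing older for this query
        statuses.extend(page)
        # Next request: only tweets strictly older than the oldest one seen
        max_id = min(tweet["id"] for tweet in page) - 1
    return statuses


# Possible usage with the client from the question (not run here):
#   tweets = search_all(t.search, 'AAPL', since='2013-12-10')
```

This only works around the per-request cap. Tweets older than the search index covers will still not come back, whatever since value is passed.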
Upvotes: 3