Reputation: 1992

Twython to extract tweets

I am using the Twython Twitter API wrapper to extract tweets, but I only get 100 tweets back. I want to extract tweets from 10 Dec 2013 to 10 March 2014. I have set count=1000 in the search call.

I understand that the limit is 100 tweets per request. Is there a way to get all the tweets in that period without hitting this limit?

 from twython import Twython
 import csv
 from dateutil import parser
 from dateutil.parser import parse as parse_date
 from datetime import datetime  # a preceding "import datetime" would be shadowed by this name
 import pytz

 utc=pytz.UTC

 APP_KEY = 'xxxxxxxxxxx'    
 APP_SECRET = 'xxxxxxxxxxx'
 OAUTH_TOKEN = 'xxxxxxxx'  # Access Token here
 OAUTH_TOKEN_SECRET = 'xxxxxxxxxxx'  

 t = Twython(app_key=APP_KEY, app_secret=APP_SECRET,
             oauth_token=OAUTH_TOKEN, oauth_token_secret=OAUTH_TOKEN_SECRET)

 search = t.search(q='AAPL', count="1000", since='2013-12-10')
 tweets = search['statuses']


 for tweet in tweets:
     pass  # do something with each tweet

Upvotes: 0

Answers (2)

user7542393

With Twython, the search API is limited, but I have had success using get_user_timeline instead.

I solved a similar problem where I wanted to grab the last X tweets from a user.

As the documentation describes, the trick that worked for me is to keep track of the id of the last tweet I read and pass it as max_id on the next request, so that the next request only returns tweets up to that one.

For your case, you would just need to modify the while loop to stop on some condition on 'created_at'. Something like this could work:

# Grab the first 200 tweets (t is the Twython instance from the question)
last_id = 0
full_timeline = 200
result = t.get_user_timeline(screen_name='NAME', count=full_timeline)

for tweet in result:
    print(tweet['text'], tweet['created_at'])
    last_id = tweet['id']

# Update full_timeline to see how many tweets were actually received.
# It will be less than 200 if we have read all of the user's tweets.
full_timeline = len(result)

# 199 because result[1:] is used below to trim the duplicate caused by max_id
while full_timeline >= 199:
    result = t.get_user_timeline(screen_name='NAME', count=200, max_id=last_id)

    # max_id is inclusive, so the response repeats the tweet we read last; trim it
    result = result[1:]
    for tweet in result:
        print(tweet['text'], tweet['created_at'])
        last_id = tweet['id']

    # Update full_timeline to keep the loop going if there are leftover tweets
    full_timeline = len(result)
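
The stop condition on 'created_at' could be sketched roughly as follows. This is only an illustration, not tested against the live API: collect_since and fetch_page are hypothetical names, and it assumes 'created_at' uses the classic format such as 'Mon Mar 10 12:00:00 +0000 2014'. fetch_page(max_id) stands in for a wrapper around t.get_user_timeline that passes max_id only when it is not None.

```python
from datetime import datetime, timezone

# Assumed format of tweet['created_at'], e.g. 'Mon Mar 10 12:00:00 +0000 2014'
CREATED_AT_FORMAT = '%a %b %d %H:%M:%S %z %Y'


def collect_since(fetch_page, start):
    """Page backwards through a timeline until a tweet is older than `start`.

    fetch_page(max_id) is a hypothetical callable that returns a list of
    tweet dicts, newest first; max_id is None for the first page.
    """
    collected = []
    max_id = None
    while True:
        page = fetch_page(max_id)
        if not page:
            break  # nothing left to read
        for tweet in page:
            created = datetime.strptime(tweet['created_at'], CREATED_AT_FORMAT)
            if created < start:
                return collected  # reached tweets older than the window
            collected.append(tweet)
        # Subtract 1 so the next page does not repeat the last tweet read
        max_id = page[-1]['id'] - 1
    return collected
```

With start = datetime(2013, 12, 10, tzinfo=timezone.utc), the function would return every tweet from the newest one back to 10 Dec 2013. Using the last id minus 1 is an alternative to trimming result[1:] as in the loop above.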

Upvotes: 1

Yuva Raj

Reputation: 3881

There is a limitation when accessing tweets through the Search API. Have a look at the Search API documentation:

The Search API usually only serves tweets from the past week.

Since you are trying to retrieve tweets from the past three to four months, the older tweets are not returned.
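
For tweets that are still inside that window, a single search request returns at most 100 statuses no matter how large count is, so bigger result sets need paging with max_id. A minimal sketch, assuming a callable search_fn that behaves like t.search and returns a dict with a 'statuses' list (search_all, search_fn and max_pages are hypothetical names):

```python
def search_all(search_fn, query, max_pages=10, **extra):
    """Collect search results page by page, up to 100 tweets per request.

    search_fn(**params) is assumed to behave like Twython's t.search and
    return a dict containing a 'statuses' list.
    """
    statuses = []
    params = dict(q=query, count=100, **extra)
    for _ in range(max_pages):  # cap the number of requests to respect rate limits
        page = search_fn(**params).get('statuses', [])
        if not page:
            break  # no older results available
        statuses.extend(page)
        # Ask only for tweets strictly older than the oldest one seen so far
        params['max_id'] = min(s['id'] for s in page) - 1
    return statuses
```

Calling search_all(t.search, 'AAPL') would then gather up to max_pages * 100 tweets. This does not get around the one-week limit; it only pages through what the Search API is willing to serve.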

Upvotes: 3
