Nima

Reputation: 71

Retrieving more than ~3,000 tweets per user ID from the Twitter API

I am new to Twitter development. I am trying to download the tweets of major news agencies. To download the tweets, I followed the guidelines at http://www.karambelkar.info/2015/01/how-to-use-twitters-search-rest-api-most-effectively. I know that the Twitter API limits the number of requests (180 requests per 15 minutes) and that each request can fetch at most 100 tweets, so I expect the following code to get 18K tweets (180 × 100) the first time I run it. However, I only get around 3,000 tweets for each news agency: for example, 3234 for nytimes and 3207 for cnn. I would be thankful if you could take a look at my code and tell me what the problem is.

import jsonpickle
import tweepy


def get_tweets(api, username, sinceId):
    max_id = -1
    maxTweets = 1000000  # Some arbitrary large number
    tweetsPerReq = 100   # tweets per request (user_timeline accepts up to 200)
    tweetCount = 0
    filename = "{0}_tweets.txt".format(username)

    print("writing to " + filename)
    with open(filename, 'w') as f:
        while tweetCount < maxTweets:
            try:
                # Build the request; every page comes from user_timeline.
                kwargs = {'screen_name': username, 'count': tweetsPerReq}
                if sinceId:
                    kwargs['since_id'] = sinceId
                if max_id > 0:
                    # Only ask for tweets older than the oldest one seen so far.
                    kwargs['max_id'] = str(max_id - 1)
                new_tweets = api.user_timeline(**kwargs)

                if not new_tweets:
                    print("no new tweet")
                    break

                # Write each tweet's raw JSON on its own line.
                for tweet in new_tweets:
                    f.write(jsonpickle.encode(tweet._json, unpicklable=False) + '\n')

                tweetCount += len(new_tweets)
                print("Downloaded {0} tweets".format(tweetCount))
                max_id = new_tweets[-1].id
            except tweepy.TweepError as e:
                # Just exit if any error
                print("some error : " + str(e))
                break

    print("Downloaded {0} tweets, Saved to {1}_tweets.txt".format(tweetCount, username))

Upvotes: 3

Views: 713

Answers (1)

Terence Eden

Reputation: 14324

Those are the limitations imposed by the API.

If you read the documentation, you will see that it says:

This method can only return up to 3,200 of a user’s most recent Tweets.

So the answer is that normal API users cannot retrieve a user's older tweets through this method.
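
To see why a max_id paging loop like the one in the question stalls near 3,200 regardless of rate limits, here is a minimal sketch run against a simulated timeline. Note the assumptions: `fake_user_timeline` is a hypothetical stand-in for `api.user_timeline` (it is not a tweepy call), the cap is hard-coded to mirror the documented limit, and the 10,000-tweet account is made up for illustration.

```python
TIMELINE_CAP = 3200  # documented limit on tweets reachable via the user timeline


def fake_user_timeline(all_ids, count=200, max_id=None):
    """Hypothetical stand-in for api.user_timeline (not a real tweepy call).

    all_ids is sorted newest-first. Only the newest TIMELINE_CAP ids are
    visible, which mimics the documented limit.
    """
    visible = all_ids[:TIMELINE_CAP]
    if max_id is not None:
        visible = [i for i in visible if i <= max_id]
    return visible[:count]


def page_through(all_ids, count=200):
    """Page backwards with max_id until a request comes back empty."""
    collected = []
    requests = 0
    max_id = None
    while True:
        page = fake_user_timeline(all_ids, count=count, max_id=max_id)
        requests += 1
        if not page:
            break
        collected.extend(page)
        max_id = page[-1] - 1  # next page starts just below the oldest id seen
    return collected, requests


# A made-up account with 10,000 tweets: ids 10000 (newest) down to 1 (oldest).
account = list(range(10000, 0, -1))
tweets, requests = page_through(account)
print(len(tweets), requests)
```

In this simulation the loop collects 3,200 ids in 16 non-empty pages and then stops on the first empty page, far short of both the 10,000 tweets in the account and the request quota. The small surplus over 3,200 reported in the question is not modelled here.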

Upvotes: 3
