Samer Aamar

Reputation: 1408

Using Python, how to collect tweets (using tweepy) between two dates?

How can I use Python and tweepy to collect tweets from Twitter that were posted between two given dates?

Is there a way to pass from/until values to the search API?


Note: I need to be able to search back in time, but WITHOUT being limited to a specific user.

I am using Python and I know the code should look something like this, but I need help to make it work:


    import tweepy

    auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
    auth.set_access_token(access_token_key, access_token_secret)
    api = tweepy.API(auth)

    collection = {}  # maps tweet id -> raw JSON dict (a list cannot be indexed by tweet id)
    for tweet in tweepy.Cursor(api.search, ???????).items():
        collection[tweet.id] = tweet._json

Upvotes: 1

Views: 3448

Answers (3)

Jay Vogel

Reputation: 1

As of now, tweepy is not the best solution. A better option is the Python library snscrape, which scrapes Twitter and can therefore retrieve tweets older than the cap of one to two weeks that Twitter sets on search. The code below scrapes only 100 English tweets between two dates and keeps only the tweet ID, but it can easily be extended for more specific searches, more or fewer tweets, or more information about each tweet.

    import snscrape.modules.twitter as sntwitter

    tweetslist = []

    # All search operators go in one plain string; no extra quotes are needed around it.
    params = "lang:en since:2020-11-01 until:2021-03-13"

    for i, tweet in enumerate(sntwitter.TwitterSearchScraper(params).get_items()):
        if i >= 100:  # stop after 100 tweets
            break
        tweetslist.append([tweet.id])

    print(tweetslist)
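
To avoid assembling the date range by hand, the search string can be built from `datetime.date` values. This is only a minimal sketch: `build_query` is a hypothetical helper, not part of snscrape, and all it does is produce the string that would be passed to `TwitterSearchScraper`.

```python
from datetime import date


def build_query(since, until, lang=None, terms=""):
    """Build a search string such as 'lang:en since:2020-11-01 until:2021-03-13'.

    Hypothetical helper for illustration; it is not part of snscrape.
    """
    if since >= until:
        raise ValueError("since must be earlier than until")
    parts = []
    if terms:
        parts.append(terms)
    if lang:
        parts.append(f"lang:{lang}")
    # isoformat() yields zero-padded YYYY-MM-DD dates
    parts.append(f"since:{since.isoformat()}")
    parts.append(f"until:{until.isoformat()}")
    return " ".join(parts)


query = build_query(date(2020, 11, 1), date(2021, 3, 13), lang="en")
print(query)
```

The resulting `query` can be used in place of the hard-coded `params` string.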

Upvotes: 0

Samer Aamar

Reputation: 1408

After long hours of investigation and stabilization, I can gladly share my findings.

  • Search by geocode: pass the geocode inside the 'q' parameter in this format: geocode:"37.781157,-122.398720,500mi". The double quotes are important. Note that the near parameter is no longer supported by this API; geocode gives more flexibility.

  • Search by time range: use the "since" and "until" operators in the following format: "since:2016-08-01 until:2016-08-02"

One more important note: Twitter doesn't allow queries with dates that are too old. I am not sure of the exact limit; I think it is 10-14 days, though the standard search API documentation mentions only the past 7 days. Either way, you cannot query for last month's tweets this way.

===================================

    for status in tweepy.Cursor(api.search,
                                q='geocode:"37.781157,-122.398720,1mi" since:2016-08-01 until:2016-08-02 include:retweets',
                                result_type='recent',
                                include_entities=True,
                                monitor_rate_limit=False,
                                wait_on_rate_limit=False).items(300):
        tweet_id = status.id
        tweet_json = status._json
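
The 'q' string can also be assembled from its parts, which makes it harder to forget the double quotes around the geocode. The sketch below is only an illustration: `geocode_query` and `within_window` are hypothetical helpers, not tweepy functions, and the 7-day default for `max_days_back` is an assumption that should be adjusted to whatever limit you actually observe.

```python
from datetime import date, timedelta


def geocode_query(lat, lon, radius, since, until):
    """Return a 'q' string combining a geocode filter with a date range."""
    # The geocode value is wrapped in double quotes, as described above.
    return (f'geocode:"{lat:.6f},{lon:.6f},{radius}" '
            f"since:{since.isoformat()} until:{until.isoformat()}")


def within_window(since, today, max_days_back=7):
    """True if 'since' is recent enough for the search to cover it.

    max_days_back is an assumed limit, not a value read from the API.
    """
    return today - since <= timedelta(days=max_days_back)


q = geocode_query(37.781157, -122.398720, "1mi",
                  date(2016, 8, 1), date(2016, 8, 2))
print(q)
print(within_window(date(2016, 8, 1), today=date(2016, 8, 5)))
```

Checking the window before calling the API avoids spending rate-limited requests on a date range that can only come back empty.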

Upvotes: 6

Timofey Chernousov

Reputation: 1294

You have to use the max_id parameter, as described in the Twitter documentation.

tweepy is a wrapper around the Twitter API, so you should be able to use this parameter.

As for geolocation, take a look at The Search API: Tweets by Place. It uses the same search API, with customized keys.
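
The max_id paging mentioned above can be sketched roughly as follows. This is an illustration under assumptions, not tweepy's own implementation: `fetch_page` is a hypothetical callable standing in for a real request (with tweepy it might wrap something like `api.search(q=..., max_id=max_id)`), and `fake_fetch` only simulates it so the example runs without credentials.

```python
def fetch_all(fetch_page, max_pages=10):
    """Collect tweets by walking backwards in time with max_id.

    fetch_page(max_id) is a hypothetical callable returning a list of dicts
    with an 'id' key, newest first, containing only ids <= max_id (or the
    most recent page when max_id is None).
    """
    collected = []
    max_id = None
    for _ in range(max_pages):
        page = fetch_page(max_id)
        if not page:
            break
        collected.extend(page)
        # Next request: only tweets strictly older than the oldest one seen.
        max_id = min(t["id"] for t in page) - 1
    return collected


# Stand-in for a real API call: ids 10..1, newest first.
FAKE_TWEETS = [{"id": i} for i in range(10, 0, -1)]


def fake_fetch(max_id, page_size=4):
    older = [t for t in FAKE_TWEETS if max_id is None or t["id"] <= max_id]
    return older[:page_size]


tweets = fetch_all(fake_fetch)
print([t["id"] for t in tweets])
```

Subtracting 1 from the smallest id seen keeps the last tweet of one page from being returned again as the first tweet of the next.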

Upvotes: -1
