Reputation: 374
Using the tweepy Python library, how can I stop streaming tweets after x seconds?
The StreamListener from tweepy.streaming continuously collects data until the user manually shuts down the program. However, I only want to collect tweets for a user-defined time interval.
Upvotes: 4
Views: 3561
Reputation: 374
There are multiple ways to solve this problem, such as multi-threading or creating a user-defined StreamListener. I will highlight one way to solve this and explain why I feel it is the best.
There is no need to define your own StreamListener subclass unless you want to override the built-in behaviour (for example, to store tweets).
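import time
from tweepy import Stream, StreamListener, OAuthHandler

# Authenticate user (written for tweepy 3.x, which provides StreamListener)
CONSUMER_KEY = 'xxxxxx'
CONSUMER_SECRET = 'xxxxxx'
ACCESS_TOKEN = 'xxxxxx'
ACCESS_TOKEN_SECRET = 'xxxxxx'
auth = OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
auth.set_access_token(ACCESS_TOKEN, ACCESS_TOKEN_SECRET)  # the streaming API needs user credentials

# How long to stream tweets, in seconds
runtime = 60  # one minute

# Start streaming in a background thread
twitterstream = Stream(auth, StreamListener())
# Apply any filter you want. tweepy 3.7+ spells the flag is_async, because
# async became a reserved word in Python 3.7 (older tweepy versions use async=True).
twitterstream.filter(track=['twitter'], is_async=True)
time.sleep(runtime)  # block the main thread for runtime seconds
twitterstream.disconnect()  # disconnect the stream and stop streaming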
This is a simple and elegant solution and works for all streams. There is no explicit multi-threading in your own code: the async flag makes tweepy run the stream in a background thread that it manages for you.
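The same start, sleep, disconnect pattern can be wrapped in a small helper so that the stream is disconnected even if the sleep is interrupted (for example by Ctrl-C). This is only a sketch: stream_for and FakeStream are hypothetical names, not part of tweepy, and FakeStream exists only so the example runs without credentials. With a real tweepy 3.7+ Stream you would pass that object instead.

```python
import threading
import time


def stream_for(stream, seconds, **filter_kwargs):
    """Start stream.filter(...) in the background, then disconnect after `seconds`."""
    stream.filter(is_async=True, **filter_kwargs)
    try:
        time.sleep(seconds)
    finally:
        # Runs even if the sleep is interrupted, so the stream never lingers.
        stream.disconnect()


class FakeStream:
    """Stand-in exposing the two methods used above; NOT a tweepy class."""

    def __init__(self):
        self.running = False
        self.ticks = 0

    def filter(self, is_async=False, **kwargs):
        # Mimic a stream that keeps working in a background thread.
        self.running = True
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        while self.running:
            self.ticks += 1
            time.sleep(0.01)

    def disconnect(self):
        self.running = False


fake = FakeStream()
started = time.monotonic()
stream_for(fake, 0.2, track=['twitter'])
elapsed = time.monotonic() - started
print(f"stopped after {elapsed:.2f}s, still running: {fake.running}")
```

The try/finally is the only real addition over the plain version: it keeps the disconnect call from being skipped when the main thread is interrupted.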
Another common method I found across Stack Overflow and many other websites is to start a timer inside a user-defined StreamListener and check whether the time limit has been exceeded in the self.on_data() method. While this is a neat hack for high-volume streams, it checks the time limit only when the stream receives a tweet. This can be a serious problem for low-volume streams (when not many people are tweeting with the filter you applied).
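To make that drawback concrete, here is a rough simulation of the timer-in-listener approach. It does not use tweepy at all: TimeLimitedListener is a hypothetical stand-in for a StreamListener subclass, and the arrival times are made up. The one tweepy 3.x behaviour it relies on is that returning False from on_data stops the stream.

```python
import time


class TimeLimitedListener:
    """Hypothetical stand-in for a tweepy 3.x StreamListener subclass."""

    def __init__(self, time_limit, clock=time.monotonic):
        self.clock = clock          # injectable so the demo can fake time
        self.start = clock()
        self.time_limit = time_limit
        self.tweets = []

    def on_data(self, data):
        # The limit is only checked here, i.e. when a tweet actually arrives.
        if self.clock() - self.start >= self.time_limit:
            return False            # tweepy 3.x stops the stream on False
        self.tweets.append(data)
        return True


def simulate(arrival_times, time_limit):
    """Feed tweets at the given times; return (stop_time, tweets_kept)."""
    now = [0.0]
    listener = TimeLimitedListener(time_limit, clock=lambda: now[0])
    for t in arrival_times:
        now[0] = t
        if not listener.on_data(f"tweet@{t}"):
            return t, len(listener.tweets)
    return None, len(listener.tweets)


# A tweet every 0.5 s: the stream stops right at the 60 s limit.
high_volume = simulate([i * 0.5 for i in range(1, 200)], time_limit=60)
# Tweets at 10 s, 45 s and 300 s: the stream only stops at 300 s.
low_volume = simulate([10, 45, 300], time_limit=60)
print(high_volume, low_volume)
```

In the low-volume case the listener overshoots the 60-second limit by four minutes, because nothing calls on_data in between. The sleep-then-disconnect approach does not have this problem, since the main thread enforces the limit regardless of traffic.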
Upvotes: 4