Reputation: 75
I am using the Stream Listener of Tweepy and wanted to retrieve tweets around the current political debate in the UK. Unfortunately, I only get truncated tweets in the case of RTs and responses.
For example, I get:
RT @ZaidJilani: Chuck Schumer (sponsor of antiBDS bill) says we should be strangling Gaza. Jeremy Corbyn says oppressing them will…
when the full tweet should be:
Chuck Schumer (sponsor of antiBDS bill) says we should be strangling Gaza. Jeremy Corbyn says oppressing them will only radicalize people.
I have seen that there is a way to use `tweet_mode=extended` with the regular Twitter API. However, I cannot find anything similar for the Streaming API. Does anyone have a solution for this? My code is as follows:
from tweepy import Stream
from tweepy import OAuthHandler
from tweepy.streaming import StreamListener
from redis import Redis
from rq import Queue
import requests
import time
import io
import os
import json
import threading
import multiprocessing
from datetime import datetime, timedelta
import _credentials
# twitter OAuth
ckey = _credentials.ckey
consumer_secret = _credentials.consumer_secret
access_token_key = _credentials.access_token_key
access_token_secret = _credentials.access_token_secret
#Listener Class Override
class listener(StreamListener):

    def __init__(self, start_time, time_limit):
        self.time = start_time
        self.limit= time_limit
        self.tweet_data = []

    def on_data(self, data):
        localtime = datetime.now().strftime("%Y-%b-%d--%H-%M-%S")
        print(localtime)
        while (time.time() - self.time) < self.limit:
            try:
                self.tweet_data.append(data)
                return True
            except BaseException:
                print ('failed ondata')
                time.sleep(5)
                pass
        saveFile = io.open(('raw_tweets_{}.json').format(localtime), 'w', encoding='utf-8')
        saveFile.write(u'[\n')
        saveFile.write(','.join(self.tweet_data))
        saveFile.write(u'\n]')
        saveFile.close()
        exit()

    def on_error(self, status):
        print (status)

    def on_disconnect(self, notice):
        print ('bye')
#Beginning of the specific code
keyword_list = ['Theresa May', 'Jeremy Corbyn', 'GE2017', 'Labour', 'Tory','Tories'] #track list
start_time=time.time()
auth = OAuthHandler(ckey, consumer_secret) #OAuth object
auth.set_access_token(access_token_key, access_token_secret)
twitterStream = Stream(auth, listener(start_time, time_limit=10)) #initialize Stream object with a time out limit
twitterStream.filter(track=keyword_list, languages=['en']) #call the filter method to run the Stream Listener
Upvotes: 0
Views: 3173
Reputation: 579
Now that some time has passed, I think that full text is supported.
At this link:
https://developer.twitter.com/en/docs/tweets/tweet-updates
It says compatibility is supported by default. My (probably ugly) code that shows how I handle it is here:
if 'extended_tweet' in raw_tweepy_data_object:
    if 'full_text' in raw_tweepy_data_object['extended_tweet']:
        text = raw_tweepy_data_object['extended_tweet']['full_text']
    else:
        pass # i need to figure out what is possible here
elif 'text' in raw_tweepy_data_object:
    text = raw_tweepy_data_object['text']
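For retweets like the one in the question, the check above may not be enough on its own: as far as I can tell, the untruncated text of a retweet sits on the original tweet under `retweeted_status`, not on the retweet itself. Here is a rough sketch that works on the already-decoded dict. The function name `extract_full_text` is my own, not part of Tweepy, and the fallback order is an assumption about what you want.

```python
def extract_full_text(tweet):
    """Return the longest text available in a decoded tweet dict.

    Hypothetical helper, not a Tweepy API. Assumes `tweet` is the dict
    produced by json.loads() on one streaming message.
    """
    # A retweet's own "text" is cut off after the "RT @user:" prefix;
    # the complete text belongs to the original tweet nested inside it.
    if "retweeted_status" in tweet:
        return extract_full_text(tweet["retweeted_status"])

    # Long tweets in the stream carry the full text under "extended_tweet".
    extended = tweet.get("extended_tweet")
    if extended and "full_text" in extended:
        return extended["full_text"]

    # REST payloads fetched with tweet_mode=extended use a top-level
    # "full_text"; otherwise fall back to the plain "text" field.
    return tweet.get("full_text") or tweet.get("text", "")
```

Note that for a retweet this returns the original tweet's text without the `RT @user:` prefix, which matches the output the question asks for.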
Upvotes: 1
Reputation: 2153
Update: support for `tweet_mode='extended'` appears to have been added.
self.stream = Stream(auth = auth, listener = self, tweet_mode= 'extended')
tweet_data = json.loads(data)
if "extended_tweet" in tweet_data:
    tweet = tweet_data['extended_tweet']['full_text']
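One thing to watch for inside `on_data`: the raw string is not always a tweet. The stream also sends things like limit and delete notices, which have no text at all. A minimal sketch of the decoding step, with a made-up function name (`text_from_raw`) and the assumption that anything without a `text` key can simply be skipped:

```python
import json


def text_from_raw(data):
    """Decode one raw streaming message and return its text.

    Hypothetical helper, not a Tweepy API. Returns None when the
    message is not a tweet.
    """
    message = json.loads(data)

    # Assumption: messages without a "text" key (for example limit or
    # delete notices) are not tweets and are skipped.
    if "text" not in message:
        return None

    # Prefer the untruncated text when the tweet carries one,
    # otherwise fall back to the regular "text" field.
    extended = message.get("extended_tweet", {})
    return extended.get("full_text", message["text"])
```

You could call this from `on_data(self, data)` and only store the result when it is not `None`.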
PS. Excuse the formatting, spelling mistakes, etc. I'm new to Stack Overflow and just wish to help others facing this problem.
Upvotes: 5