Reputation: 15
I am trying to collect tweets with tweepy from a list of tweet IDs, good_tweet_ids_test, using statuses_lookup.
Since the list is a bit old, some tweets will have been deleted by now. Therefore I ignore errors in the lookup_tweets function, so it does not stop each time.
Here is my code so far:
def lookup_tweets(tweet_IDs, api):
    full_tweets = []
    tweet_count = len(tweet_IDs)
    try:
        for i in range((tweet_count // 100) + 1):
            # Catch the last group if it is less than 100 tweets
            end_loc = min((i + 1) * 100, tweet_count)
            full_tweets.extend(
                api.statuses_lookup(tweet_IDs[i * 100:end_loc], tweet_mode='extended')
            )
        return full_tweets
    except:
        pass
results = lookup_tweets(good_tweet_ids_test, api)
temp = json.dumps([status._json for status in results]) #create JSON
newdf = pd.read_json(temp, orient='records')
newdf.to_json('tweepy_tweets.json')
But when I run the temp = json.dumps([status._json for status in results]) line, it gives me the error:
TypeError: 'NoneType' object is not iterable
I do not know how to fix this. I believe the type of some of the statuses is None, because they have been deleted and can therefore not be looked up now. I simply wish for my code to move on to the next status, if the type is None.
EDIT: As has been pointed out, the issue is that results is None. So now I think I need to exclude None values from the full_tweets variable, but I cannot figure out how. Any help?
EDIT2: With further testing I have found that results is None only when the batch contains a tweet ID that has since been deleted. If the batch contains only active tweets, it works. So I think I need to figure out how to have my function look up the batch of tweets and return only those that are not None. Any help on this?
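A minimal sketch, independent of tweepy, of the behaviour described in the edits above: a function whose except branch only does pass falls off the end and returns None, so it is the whole result that is None rather than individual statuses. The names fetch and failing_call are hypothetical stand-ins, not part of the original code.

```python
def failing_call():
    # Hypothetical stand-in for an API call that raises partway through.
    raise RuntimeError("lookup failed")


def fetch():
    items = []
    try:
        items.extend(failing_call())
        return items
    except Exception:
        pass  # nothing is returned on this path, so the caller gets None


result = fetch()
print(result is None)

try:
    [x for x in result]
except TypeError as err:
    # Same kind of error as in the question: None cannot be iterated.
    print(err)
```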
Upvotes: 0
Views: 180
Reputation: 312136
Instead of implicitly returning None when there's an error, you could explicitly return an empty list. That way, the result of lookup_tweets will always be iterable, and the calling code won't have to check its result:
def lookup_tweets(tweet_IDs, api):
    full_tweets = []
    tweet_count = len(tweet_IDs)
    try:
        for i in range((tweet_count // 100) + 1):
            # Catch the last group if it is less than 100 tweets
            end_loc = min((i + 1) * 100, tweet_count)
            full_tweets.extend(
                api.statuses_lookup(tweet_IDs[i * 100:end_loc], tweet_mode='extended')
            )
        return full_tweets
    except:
        return []  # Here!
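Note that returning an empty list from the except block still discards every batch that was fetched before the failure. One possible alternative, sketched here under assumptions, is to wrap each batch in its own try/except so that a failing batch is skipped and the rest are kept. The function name lookup_tweets_per_batch and the batch_size parameter are hypothetical. The sketch assumes the same api.statuses_lookup call used in the question (tweepy 4 renamed it to lookup_statuses), and catching Exception is a placeholder that should probably be narrowed to tweepy's own error class.

```python
def lookup_tweets_per_batch(tweet_ids, api, batch_size=100):
    """Look up tweets in batches, skipping any batch that raises."""
    full_tweets = []
    # Stepping by batch_size never produces an empty trailing batch,
    # unlike range((count // 100) + 1) when count is a multiple of 100.
    for start in range(0, len(tweet_ids), batch_size):
        batch = tweet_ids[start:start + batch_size]
        try:
            full_tweets.extend(
                api.statuses_lookup(batch, tweet_mode='extended')
            )
        except Exception as exc:  # assumption: narrow to tweepy's error class
            print(f"Skipping batch starting at index {start}: {exc}")
    return full_tweets
```

It would be called the same way as before, for example results = lookup_tweets_per_batch(good_tweet_ids_test, api). Printing the exception instead of silently swallowing it also helps show what actually went wrong in a failing batch.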
Upvotes: 0