Reputation: 55
I am outputting tweets to a CSV file and want to split the text portion so that each word goes in its own column, so I can run it through a classifier using Python.
for tweet in alltweets:
    # Loop to only return the tweets that have been posted in the last 24 hours
    if (datetime.datetime.now() - tweet.created_at).days < 1:
        # transform the tweepy tweets into a 2D array that will populate the csv
        outtweets.append([tweet.user.name, tweet.created_at, tweet.text.encode("utf-8")])
    else:
        deadend = True
        return
if not deadend:
    page += 1
# write the csv
with open('%s_tweets.csv' % screen_name, 'w') as f:
    writer = csv.writer(f)
    writer.writerow(["name", "created_at", "text"])
    writer.writerows(outtweets)
    pass
** EDIT 2 **
outtweets.append(list(itertools.chain([tweet.user.name, tweet.created_at],tweet.text.encode("utf-8").split(' '))))
TypeError: a bytes-like object is required, not 'str'
Upvotes: 0
Views: 3507
Reputation: 18808
Since tweet.text.encode("utf-8") is a single string, you can split it on spaces into individual words before writing it out.
import itertools

tweets = [['user1', 'text of tweet 1'], ['user2', 'text of tweet2']]
for tweet in tweets:
    # Keep the user name, then add one element per word of the text
    print(list(itertools.chain([tweet[0]], tweet[1].split(' '))))
Output:
['user1', 'text', 'of', 'tweet', '1']
['user2', 'text', 'of', 'tweet2']
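Try this in your code, in place of the current outtweets.append:
outtweets.append(list(itertools.chain([tweet.user.name, tweet.created_at], tweet.text.encode("utf-8").split(' '))))
On Python 3, encode() returns bytes, so split(' ') raises the TypeError shown in the question. In that case, drop the .encode("utf-8") call and split tweet.text directly.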
The above code builds two lists, one with the existing attributes and one with the words of the tweet text, and then merges them into a single row. Because tweets differ in length, the rows will not all have the same number of columns.
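The merge described above could be sketched end to end as follows. This is only an illustration, assuming Python 3 and tweet text that is already a str. The sample tuples, the `tweet_to_row` helper, and the in-memory buffer are hypothetical stand-ins for the tweepy objects and the output file in the question.

```python
import csv
import io
import itertools

# Hypothetical stand-ins for tweepy statuses: (user name, created_at, text).
sample_tweets = [
    ("user1", "2017-01-01 10:00:00", "text of tweet 1"),
    ("user2", "2017-01-01 11:00:00", "text of tweet2"),
]


def tweet_to_row(name, created_at, text):
    """Return the fixed fields followed by one element per word of the text."""
    # str.split() with no argument splits on any run of whitespace, so
    # double spaces or newlines in a tweet do not create empty columns.
    return list(itertools.chain([name, created_at], text.split()))


rows = [tweet_to_row(*tweet) for tweet in sample_tweets]

# io.StringIO stands in for a real file opened with
# open(path, "w", newline="", encoding="utf-8").
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

Passing the encoding to open() instead of encoding each field keeps every value a str, which is what csv.writer expects on Python 3.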
Upvotes: 1