The CSV file has four columns: tweet_id, created_at, tweet_text, tweet_media_url.
The tweet_text column is already UTF-8 encoded.
import csv
f = open('tweets.csv')
csv_f = csv.reader(f)
#==============================================================================
tweet_text = []
for row in csv_f:
    tweet_text.append(row[2])
#==============================================================================
def deEmojify(inputString):
    inputString = inputString.encode('ascii', 'ignore').decode('ascii')
    return inputString
#===============================================================================
text1="b'@JWSpry Have some fun with this! \xf0\x9f\x98\x82 I can only post four at a time - a few more are coming."
text2=deEmojify(text1)
print(text2)
Output: b'@JWSpry Have some fun with this! I can only post four at a time - a few more are coming.
print(tweet_text[7])
Output: b'@JWSpry Have some fun with this! \xf0\x9f\x98\x82 I can only post four at a time - a few more are coming.
text3=deEmojify(tweet_text[7])
print(text3)
Output: b'@JWSpry Have some fun with this! \xf0\x9f\x98\x82 I can only post four at a time - a few more are coming.
Why does the code work for text1 (which I copied and pasted from the CSV) but not for tweet_text[7]?
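One possible explanation, sketched under an assumption the post does not confirm: the tweet_text cells may hold the printed form of a bytes object (the separate characters b, ', \, x, f, 0 and so on), whereas pasting the same text into Python source makes the interpreter turn each \xNN escape into a single non-ASCII character. The names pasted, from_file, cell, and clean_cell are made up for illustration, and the shortened sample strings stand in for the real tweet.

```python
import ast


def deEmojify(inputString):
    # Same logic as in the question: drop every non-ASCII character.
    return inputString.encode('ascii', 'ignore').decode('ascii')


# Pasted into source code: Python turns each \xNN escape into ONE character,
# so this string contains four non-ASCII characters.
pasted = "fun! \xf0\x9f\x98\x82 more"

# Read from a file: the backslash, 'x', 'f', '0', ... stay separate ASCII
# characters. The raw-string prefix r"" reproduces that situation here.
from_file = r"fun! \xf0\x9f\x98\x82 more"

print(len(pasted), len(from_file))        # the two lengths differ
print(deEmojify(pasted))                  # the four characters are dropped
print(deEmojify(from_file) == from_file)  # nothing to drop: already all ASCII


def clean_cell(cell):
    # Hypothetical helper. Assumes the cell is a complete bytes literal
    # such as "b'...'" with both the opening and the closing quote present.
    raw = ast.literal_eval(cell)  # str "b'...'" -> bytes object
    text = raw.decode('utf-8')    # bytes -> str containing the real emoji
    return deEmojify(text)


cell = r"b'fun! \xf0\x9f\x98\x82 more'"
print(clean_cell(cell))
```

If that assumption holds, calling clean_cell(tweet_text[7]) instead of deEmojify(tweet_text[7]) should strip the emoji; if the cell is not a well-formed bytes literal, ast.literal_eval raises an error rather than returning a wrong result.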