Reputation: 2600
Hopefully someone can help me out with the following. It is probably not too complicated but I haven't been able to figure it out. My "output.txt" file is created with:
f = open('output.txt', 'w')
print(tweet['text'].encode('utf-8'), file=f)
print(tweet['created_at'][0:19].encode('utf-8'), file=f)
print(tweet['user']['name'].encode('utf-8'), file=f)
f.close()
If I don't encode it before writing to the file, I get errors. So "output.txt" contains 3 rows of UTF-8 encoded output:
b'testtesttest'
b'line2test'
b'\xca\x83\xc9\x94n ke\xc9\xaan'
In "main.py", I am trying to convert this back to a string:
f = open("output.txt", "r", encoding="utf-8")
text = f.read()
print(text)
f.close()
Unfortunately, the b'' format is still not removed. Do I still need to decode it? If possible, I would like to keep the 3-row structure. My apologies for the newbie question, this is my first one on SO :)
Thank you so much in advance!
Upvotes: 4
Views: 11301
Reputation: 2600
With the help of the people answering my question, I have been able to get it to work. The solution is to change how the file is written: open it with an explicit encoding and write the strings directly instead of encoding them first.
import json
tweet = json.loads(data)  # data: the raw JSON string for one tweet
tweet_text = tweet['text']  # content of the tweet
tweet_created_at = tweet['created_at'][0:19]  # tweet created at
tweet_user = tweet['user']['name']  # tweet created by
with open('output.txt', 'w', encoding='utf-8') as f:
    f.write(tweet_text + '\n')
    f.write(tweet_created_at + '\n')
    f.write(tweet_user + '\n')
Then read it like:
f = open("output.txt", "r", encoding='utf-8')
text = f.read()
print(text)
f.close()
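For anyone who wants to check the round trip without real tweet data, here is a minimal sketch. The three sample strings and the temporary file path are made up for illustration; the point is only that writing str objects to a file opened with encoding='utf-8' and reading them back the same way should leave no b'' wrappers.

```python
import os
import tempfile

# Sample values standing in for the three tweet fields (made up for illustration).
lines = ["testtesttest", "line2test", "ʃɔn keɪn"]

# Use a temporary directory so the sketch does not overwrite a real output.txt.
path = os.path.join(tempfile.mkdtemp(), "output.txt")

# Write str objects directly; the encoding argument takes care of UTF-8.
with open(path, "w", encoding="utf-8") as f:
    for line in lines:
        f.write(line + "\n")

# Read back with the same encoding; every row is already a str.
with open(path, "r", encoding="utf-8") as f:
    read_back = f.read().splitlines()

print(read_back)
```

The b'' prefix in the original file came from printing bytes objects, which writes their repr; skipping the manual .encode() call avoids that.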
Upvotes: 3
Reputation: 5474
If b and the quote ' are in your file, that means there is a problem with your file. Someone probably did write(print(line)) instead of write(line). Now to decode it, you can use literal_eval. Otherwise @m_callens's answer should be OK.
import ast
with open("b.txt", "r") as f:
    text = [ast.literal_eval(line) for line in f]
for l in text:
    print(l.decode('utf-8'))
# testtesttest
# line2test
# ʃɔn keɪn
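The same idea can be wrapped in a small helper if you need it in more than one place. This is only a sketch: decode_literal_lines is a hypothetical name, and it assumes every non-blank line is a valid Python literal such as b'...'. A line of plain text would make ast.literal_eval raise an exception.

```python
import ast


def decode_literal_lines(lines, encoding="utf-8"):
    """Hypothetical helper: turn lines holding bytes literals back into str."""
    decoded = []
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue  # skip blank lines
        value = ast.literal_eval(raw)  # safely parses the b'...' literal
        if isinstance(value, bytes):
            value = value.decode(encoding)
        decoded.append(value)
    return decoded


# The three rows from the question, as they would appear in the file.
sample = [
    "b'testtesttest'\n",
    "b'line2test'\n",
    "b'\\xca\\x83\\xc9\\x94n ke\\xc9\\xaan'\n",
]
result = decode_literal_lines(sample)
print(result)
```

Passing an open file object instead of the sample list should work the same way, since iterating a text file yields its lines.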
Upvotes: 0
Reputation: 6360
Instead of specifying the encoding when opening the file, use it to decode as you read.
f = open("output.txt", "rb")
text = f.read().decode(encoding="utf-8")
print(text)
f.close()
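One caveat, shown as a rough sketch below: decoding the raw bytes yields the same text as opening the file in text mode with an encoding, so if the file already contains the literal characters b'...', they will still be there after decoding. The temporary path and sample string are assumptions made for the demonstration.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "output.txt")

# Reproduce the problem file: the repr of a bytes object written as text.
with open(path, "w", encoding="utf-8") as f:
    f.write(repr("ʃɔn keɪn".encode("utf-8")) + "\n")

# Approach from this answer: read bytes, then decode.
with open(path, "rb") as f:
    via_bytes = f.read().decode("utf-8")

# Approach from the question: let open() decode.
with open(path, "r", encoding="utf-8") as f:
    via_text = f.read()

print(via_bytes.strip())
print(via_text.strip())
```

Both reads return the b'...' text unchanged, which suggests the fix belongs on the writing side or in a literal_eval step.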
Upvotes: 1