Reputation: 23
I want to read some cyrilic text from a txt file in Python 3. This is what the text file contains.
абцдефгчийклмнопярстувшхыз
I used:
with open('text.txt', 'r') as myfile:
text=myfile.read()
print (text)
But this is the ouput in the python shell:
ÿþ01F45D3G89:;<=>?O@ABC2HEK7
Can someone explain why this is the output?
Upvotes: 2
Views: 12680
Reputation:
Python supports utf-8 for this sort of thing.
You should be able to do:
with open('text.txt', encoding = 'utf-8', mode = 'r') as my_file:
...
Also, be sure that your text file is saved with utf-8 encoding. I tested this in my shell and without proper encoding my output was:
?????????????????????
With proper encoding:
file = open('text.txt', encoding='utf-8', mode='r')
text = file.read()
print(text)
абцдефгчийклмнопярстувшхы
Upvotes: 6
Reputation: 1215
Try working on the file using codecs, you need to
import codecs
and then do
text = codecs.open('text.txt', 'r', 'utf-8')
Basically you need utf8
Upvotes: 1