Reputation: 185
I'm using Python 2.7. I've tried several things, such as the codecs module, but none of them worked. How can I fix this?
myfile.txt
sözcük
My code
f = open('myfile.txt', 'r')
for line in f:
    print line
f.close()
Output
s\xc3\xb6zc\xc3\xbck
The output is the same in Eclipse and in the command window. I'm using Windows 7. Characters display without problems when they don't come from a file.
Upvotes: 6
Views: 27763
Reputation: 20308
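from chardet import detect
# guess the encoding of a byte string (chardet is a third-party package)
encoding = lambda x: detect(x)['encoding']
# `line` is a byte string read from the file, as in the question's loop
print encoding(line)
# decode with the detected encoding, dropping bytes that cannot be decoded
n_line = unicode(line, encoding(line), errors='ignore')
print n_line
print n_line.encode('utf8')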
Upvotes: 7
Reputation: 161
import codecs
# open the file with UTF-8 decoding
f = codecs.open("myfile.txt", "r", encoding='utf-8')
# read the whole file into a unicode string
sfile = f.read()
# check the type of the decoded text
print type(sfile)  # <type 'unicode'>
# encode the unicode string back to a byte string so it displays properly
print sfile.encode('utf-8')
# check the type of the encoded string
print type(sfile.encode('utf-8'))  # <type 'str'>
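For comparison, here is a minimal sketch of the same idea with io.open, which exists in Python 2.6+ and is the built-in open in Python 3. It assumes the file really is UTF-8; the temporary path and the variable names are only there to make the example self-contained.

```python
from __future__ import print_function
import io
import os
import tempfile

# Create a small UTF-8 file so the sketch is self-contained (hypothetical path).
path = os.path.join(tempfile.mkdtemp(), "myfile.txt")
with io.open(path, "w", encoding="utf-8") as f:
    f.write(u"s\u00f6zc\u00fck\n")

# io.open decodes while reading, so every line is already a unicode string.
with io.open(path, "r", encoding="utf-8") as f:
    lines = [line.rstrip(u"\n") for line in f]

first = lines[0]
# Length is counted in characters, not in UTF-8 bytes.
print(len(first))
```

Because the text is decoded once at the file boundary, the rest of the program only ever handles unicode strings.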
Upvotes: 16
Reputation: 1223
It's the terminal encoding. Try configuring your terminal to use the same encoding as your file. I recommend UTF-8.
By the way, it is good practice to decode all input and encode all output explicitly to avoid problems:
f = open('test.txt', 'r')
for line in f:
    l = unicode(line, encoding='utf-8')  # decode the input
    print l.encode('utf-8')  # encode the output
f.close()
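To see why the question's output looks the way it does, here is a rough sketch that decodes those exact bytes and checks which encoding the console reports. It assumes the bytes are UTF-8, and the fallback to "utf-8" when no console encoding is reported is my own choice, not something Python does for you.

```python
from __future__ import print_function
import sys

# The raw bytes shown in the question's output.
raw = b"s\xc3\xb6zc\xc3\xbck"

# Decoding turns the two-byte UTF-8 sequences into single characters.
text = raw.decode("utf-8")
print(len(raw), len(text))

# Encoding reported by the console; it can be None when output is piped.
console_encoding = getattr(sys.stdout, "encoding", None) or "utf-8"
print(console_encoding)

# Encode for the console, replacing characters it cannot represent.
encoded = text.encode(console_encoding, "replace")
```

If the reported encoding is not UTF-8 (on Windows it is often a legacy code page), printing the UTF-8 bytes directly is what produces garbled characters.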
Upvotes: 1