Reputation: 27
I have a file.txt with the following content:
Straße
Straße 1
Straße 2
I want to read this text from the file and print it. I tried this, but it won't work.
import random
lmao1 = open('file.txt').read().splitlines()
lmao = random.choice(lmao1)
print str(lmao).decode('utf8')
But I get the error:
UnicodeDecodeError: 'utf8' codec can't decode byte 0xdf in position 5: invalid continuation byte
Upvotes: 1
Views: 2710
Reputation: 177971
If you are on Windows, the file is likely encoded in cp1252.
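To see why the error names byte 0xdf, here is a minimal sketch, assuming the file really was saved as cp1252. The variable names are only illustrative. In cp1252 the letter ß is the single byte 0xdf, which UTF-8 treats as the start of a two-byte sequence, so the plain 'e' that follows is rejected as an invalid continuation byte.

```python
# Illustrative only: assumes the file was saved as cp1252.
text = u"Stra\xdfe"            # u"Straße", written with an escape so the
                               # source file's own encoding does not matter
raw = text.encode("cp1252")    # ß becomes the single byte 0xdf

try:
    raw.decode("utf-8")        # 0xdf starts a 2-byte sequence in UTF-8 ...
    utf8_failed = False
except UnicodeDecodeError:     # ... but 'e' is not a continuation byte
    utf8_failed = True

decoded = raw.decode("cp1252") # decoding with the matching codec succeeds
```

The same bytes decode cleanly once the codec matches the one used to write the file.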
Whatever the encoding, use io.open and specify the encoding. The code below works in both Python 2 and 3. io.open returns Unicode strings. It is good practice to convert to/from Unicode immediately at the I/O boundaries of your program. In this case that means reading the file as Unicode in the first place and leaving print to determine the appropriate encoding for the terminal.
Switching to Python 3, where Unicode handling is greatly improved, is also recommended.
from __future__ import print_function
import io
import random

with io.open('file.txt', encoding='cp1252') as f:
    lines = f.read().splitlines()

line = random.choice(lines)
print(line)
Upvotes: 1
Reputation: 16670
You're on the right track with decode; the only problem is that there is no way to guess a file's encoding with 100% certainty. Try a different encoding (e.g. latin-1).
Upvotes: 0
Reputation: 2301
Got it. utf-8 is not the correct encoding for this file, so decode with latin-1 instead. If that doesn't work, try other common encodings until you find the right one.
print str(lmao).decode('latin-1')
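"Try encodings until one works" could be sketched roughly as follows. The function name read_lines_guessing and the candidate list are assumptions for illustration, not part of any library. Note that latin-1 maps every possible byte to a character, so it never raises; a clean decode only shows the bytes are valid in that codec, not that the text is right. Put strict codecs such as utf-8 first.

```python
import io


def read_lines_guessing(path, encodings=("utf-8", "cp1252")):
    """Return (lines, encoding) for the first candidate that decodes cleanly.

    Hypothetical helper: the candidate list is a guess and should be
    adjusted to the encodings you actually expect.
    """
    with io.open(path, "rb") as f:   # read raw bytes, decode manually
        raw = f.read()
    for enc in encodings:
        try:
            return raw.decode(enc).splitlines(), enc
        except UnicodeDecodeError:
            continue                 # this codec rejected the bytes
    raise ValueError("none of the candidate encodings matched")
```

Called as `lines, enc = read_lines_guessing('file.txt')`, it returns Unicode strings that can be passed to random.choice and print directly, and it works on both Python 2 and 3.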
Upvotes: 1
Reputation: 717
It works fine for me, both at the Python prompt and when run from a Python script.
>>> import random
>>> lmao = random.choice(lmao1)
>>> print str(lmao).decode('utf8')
Straße 2
The above worked on Python 2.7. Which Python version are you using?
Upvotes: -1