Dking1199

Reputation: 27

Python read and write 'ß' from file

I have a file.txt with the input

Straße
Straße 1
Straße 2

I want to read this text from the file and print it. I tried this, but it won't work.

import random

lmao1 = open('file.txt').read().splitlines()
lmao = random.choice(lmao1)
print str(lmao).decode('utf8')

But I get the error:

UnicodeDecodeError: 'utf8' codec can't decode byte 0xdf in position 5: invalid continuation byte

Upvotes: 1

Views: 2710

Answers (4)

Mark Tolonen

Reputation: 177971

If on Windows, the file is likely encoded in cp1252.

Whatever the encoding, use io.open and specify it explicitly. The code below works in both Python 2 and 3.

io.open will return Unicode strings. It is good practice to immediately convert to/from Unicode at the I/O boundaries of your program. In this case that means reading the file as Unicode in the first place and leaving print to determine the appropriate encoding for the terminal.

I also recommend switching to Python 3, where Unicode handling is greatly improved.

from __future__ import print_function
import io
import random
with io.open('file.txt',encoding='cp1252') as f:
    lines = f.read().splitlines()
line = random.choice(lines)
print(line)
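Since the question title also mentions writing, here is a minimal round-trip sketch of the same idea: encode when writing, decode when reading, and use the same encoding on both sides. The scratch path is hypothetical and created only for the illustration, and cp1252 is the assumption made above for a Windows file, not something confirmed by the question.

```python
from __future__ import print_function
import io
import os
import tempfile

# u'\xdf' is the code point of 'ß'; escapes keep this snippet ASCII-only.
text = u'Stra\xdfe\nStra\xdfe 1\nStra\xdfe 2\n'

# Hypothetical scratch file, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), 'file.txt')

# Encode at the output boundary...
with io.open(path, 'w', encoding='cp1252') as f:
    f.write(text)

# ...and decode at the input boundary with the same encoding.
with io.open(path, encoding='cp1252') as f:
    lines = f.read().splitlines()

print(lines == [u'Stra\xdfe', u'Stra\xdfe 1', u'Stra\xdfe 2'])
```

Inside the program everything stays a Unicode string; bytes only exist in the file.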

Upvotes: 1

Bastian Venthur

Reputation: 16670

You're on the right track with decode. The only problem is that there is no way to guess a file's encoding with 100% certainty. Try a different encoding (e.g. latin-1).
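A rough sketch of "try a different encoding" as code. The helper name decode_with_fallback and the candidate list are made up for this illustration. Note that latin-1 maps every possible byte to a character, so it never raises; it belongs last in the list, and a successful decode with it does not prove the guess was right.

```python
def decode_with_fallback(data, encodings=('utf-8', 'cp1252', 'latin-1')):
    """Return (text, encoding) for the first candidate that decodes data.

    Hypothetical helper: the candidate order is an assumption, not a rule.
    """
    for encoding in encodings:
        try:
            return data.decode(encoding), encoding
        except UnicodeDecodeError:
            continue  # wrong guess, try the next candidate
    raise ValueError('none of the candidate encodings matched')


raw = b'Stra\xdfe'  # the word as stored in a cp1252 / latin-1 file
text, used = decode_with_fallback(raw)
```

Here utf-8 is rejected and the second candidate is used. This is still guessing; if you know how the file was written, pass that encoding directly.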

Upvotes: 0

Evan

Reputation: 2301

Got it. utf-8 is not the correct encoding for this file. Try latin-1 as shown below; if that doesn't work, try other common encodings until you find the right one.

print str(lmao).decode('latin-1')
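To see why utf-8 is rejected, here is a small sketch that works on the raw bytes directly instead of the file. It assumes the file stores 'ß' as the single byte 0xdf, which is what the error message in the question reports. In UTF-8, 0xdf announces a two-byte sequence, so the next byte must lie in the range 0x80 to 0xbf; the 'e' that follows (0x65) does not, hence "invalid continuation byte".

```python
raw = b'Stra\xdfe'  # assumed file content for the first line

try:
    raw.decode('utf-8')
    utf8_ok = True
except UnicodeDecodeError:
    # 0xdf starts a two-byte sequence, but 'e' is not a continuation byte
    utf8_ok = False

# latin-1 maps byte 0xdf straight to U+00DF, which is 'ß'
decoded = raw.decode('latin-1')
```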

Upvotes: 1

jagatjyoti

Reputation: 717

It works fine for me, both at the Python prompt and when run from a script.

>>> import random
>>> lmao =random.choice(lmao1)
>>> print str(lmao).decode('utf8')
Straße 2

The above worked on Python 2.7. Which Python version are you using?
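A likely explanation for the different results, sketched under the assumption that the file used in this answer was saved as UTF-8 while the asker's was not: the same word is stored as different bytes depending on the encoding, and decode('utf8') only succeeds on the UTF-8 form.

```python
word = u'Stra\xdfe'  # 'Straße', written with an escape to stay ASCII-only

utf8_bytes = word.encode('utf-8')      # 'ß' becomes two bytes: 0xc3 0x9f
latin1_bytes = word.encode('latin-1')  # 'ß' becomes one byte: 0xdf

# Only the UTF-8 bytes round-trip through decode('utf-8').
roundtrip = utf8_bytes.decode('utf-8')
```

So the Python version is probably not the deciding factor; how the editor saved file.txt is.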

Upvotes: -1
