Reputation: 1170
I have a CSV file that is not UTF-8 encoded, and it seems impossible to open it in Python 3. I've tried all kinds of .encode() calls with Windows-1252, ISO-8859-1 and latin-1, and every time I get:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xfc in position 279: invalid start byte
The 0xfc byte is the German ü.
I concede that my judgment is impaired, since I have been fighting with this issue for a long time now. What am I missing? I've always had problems with Unicode in Python, but this one seems especially stubborn.
This is the first time I have tried to work with Python 3, and as far as I understand there is no .decode() anymore, which could have solved the issue in a second.
EDIT: code to open file:
import unicodecsv as csv
csv.reader(open('myFile.csv', 'r'), delimiter = ';')
Upvotes: 1
Views: 5586
Reputation: 148965
Simply specify the encoding when opening the file:
with open("xxx.csv", encoding="latin-1") as fd:
    rd = csv.reader(fd)
    ...
or with your own code:
csv.reader(open('myFile.csv', 'r', encoding='latin1'), delimiter = ';')
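For a version you can run end to end, here is a minimal sketch. It assumes the standard-library csv module rather than unicodecsv, which as far as I know expects a file opened in binary mode. The sample rows and the temporary file are made up for illustration. The point is that in Python 3 the decoding happens in open(), so the encoding argument belongs there and no .encode() or .decode() call is needed afterwards.

```python
import csv
import os
import tempfile

# Hypothetical sample data, encoded as Latin-1 so that "ü" is the single byte 0xfc.
sample = "name;city\nMüller;Zürich\n".encode("latin-1")

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "myFile.csv")
    with open(path, "wb") as fh:
        fh.write(sample)

    # open() decodes the bytes to str; newline="" is what the csv docs recommend.
    with open(path, encoding="latin-1", newline="") as fd:
        rows = list(csv.reader(fd, delimiter=";"))

print(rows)
```

If the file actually came from a Windows program, encoding="cp1252" may be the better guess. For ü both encodings give the same result, since 0xfc maps to the same character in each.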
Upvotes: 4