Zlo

Reputation: 1170

Opening a non-UTF-8 CSV file in Python 3

I have a CSV file that is not UTF-8 encoded, and it seems impossible to open it in Python 3. I've tried all kinds of encodings with .encode() (Windows-1252, ISO-8859-1, latin-1), and every time I get:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xfc in position 279: invalid start byte

The 0xfc byte is the German ü.
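A minimal sketch of why that byte triggers the error, using only built-in `bytes` and `str` methods (the variable names are illustrative). In Latin-1, 0xfc is ü, while UTF-8 encodes ü as two bytes, so a lone 0xfc is not valid UTF-8. Note that `bytes.decode()` still exists in Python 3; only `str` lost its `.decode()` method.

```python
raw = b"\xfc"

# Latin-1 maps each byte directly to one character, so this succeeds.
text = raw.decode("latin-1")

# UTF-8 represents the same character with two bytes.
utf8_bytes = text.encode("utf-8")

# Decoding the single Latin-1 byte as UTF-8 fails.
message = None
try:
    raw.decode("utf-8")
except UnicodeDecodeError as exc:
    message = str(exc)

print(text, utf8_bytes, message)
```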

I concede that my judgment is impaired, since I have been fighting with this issue for a long time now. What am I missing? I've always had problems with Unicode in Python, but this one seems especially stubborn.

This is the first time I have tried to work with Python 3, and as far as I understand there is no .decode() anymore, which could have solved the issue in a second.

EDIT: code to open file:

import unicodecsv as csv
csv.reader(open('myFile.csv', 'r'), delimiter = ';')

Upvotes: 1

Views: 5586

Answers (1)

Serge Ballesta

Reputation: 148965

Simply specify the encoding when opening the file:

import csv

# Latin-1 maps every byte to a character, so 0xfc decodes to ü
with open("xxx.csv", encoding="latin-1") as fd:
    rd = csv.reader(fd)
    ...

or with your own code:

csv.reader(open('myFile.csv', 'r', encoding='latin1'), delimiter=';')
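For a self-contained check, here is a sketch that assumes the standard-library `csv` module (not `unicodecsv`, which as far as I know expects files opened in binary mode). The sample file, its path, and its contents are made up for illustration. It writes a small Latin-1 file, shows that reading it as UTF-8 fails, and then reads it correctly. Passing `newline=""` follows the recommendation in the `csv` module documentation.

```python
import csv
import os
import tempfile

# Create a hypothetical sample file encoded as Latin-1.
path = os.path.join(tempfile.mkdtemp(), "sample.csv")
with open(path, "w", encoding="latin-1", newline="") as fd:
    csv.writer(fd, delimiter=";").writerows(
        [["name", "city"], ["Jürgen", "München"]]
    )

# Reading with the wrong encoding reproduces the error from the question.
utf8_error = None
try:
    with open(path, encoding="utf-8", newline="") as fd:
        list(csv.reader(fd, delimiter=";"))
except UnicodeDecodeError as exc:
    utf8_error = exc

# Reading with the matching encoding works.
with open(path, encoding="latin-1", newline="") as fd:
    rows = list(csv.reader(fd, delimiter=";"))

print(rows)
```

If the file actually came from a Windows program, `encoding="cp1252"` may be the better guess; for ü both encodings give the same result.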

Upvotes: 4
