Reading csv with unicodecsv: UnicodeDecodeError

Question

I have these lines of code:

import zipfile
import unicodecsv

zf = zipfile.ZipFile(self.temp_file, 'r')
data = zf.open('myfile.csv', mode='r')
result = [link for link in unicodecsv.DictReader(data)]

And here's the exception:

UnicodeDecodeError: 'utf8' codec can't decode byte 0xc9 in position 13: invalid continuation byte

Input string is:

CAFÉ RESTAURANT

So what am I doing wrong, and why can't unicodecsv handle UTF-8?

Antti Haapala -- Слава Україні · Accepted Answer

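It is because your input is not UTF-8 but Latin-1 (or similar). In UTF-8, É is encoded as 2 bytes: '\xc3\x89'. The error says that the byte \xc9 was encountered in the input; that is É's encoding in the Latin-1 and Windows-1252 code pages. In UTF-8, \xc9 can only start a 2-byte sequence, and the byte that follows it here is not a valid continuation byte, hence the error.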
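To make the byte-level difference concrete, here is a minimal sketch using only the standard library on Python 3. The in-memory zip, the `name` header, and the choice of Latin-1 are assumptions for illustration; the real file may well be Windows-1252 instead. With unicodecsv itself, the analogous fix should be to pass the actual encoding to the reader, e.g. `unicodecsv.DictReader(data, encoding='latin-1')`.

```python
import csv
import io
import zipfile

text = "CAFÉ RESTAURANT"

# É is two bytes in UTF-8 but a single byte (0xC9) in Latin-1.
utf8_bytes = text.encode("utf-8")
latin1_bytes = text.encode("latin-1")
assert b"\xc3\x89" in utf8_bytes
assert b"\xc9" in latin1_bytes

# Decoding Latin-1 bytes as UTF-8 fails, as in the question.
try:
    latin1_bytes.decode("utf-8")
    decoded_ok = True
except UnicodeDecodeError:
    decoded_ok = False

# Build a hypothetical zip in memory holding a Latin-1 encoded CSV.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("myfile.csv", "name\nCAFÉ RESTAURANT\n".encode("latin-1"))

# Read it back, telling the decoder which encoding the file really uses.
with zipfile.ZipFile(buf, "r") as zf:
    with zf.open("myfile.csv", mode="r") as raw:
        wrapped = io.TextIOWrapper(raw, encoding="latin-1", newline="")
        rows = list(csv.DictReader(wrapped))
```

If the file's encoding is not known in advance, trying Windows-1252 before Latin-1 is a common heuristic, but that is a guess rather than detection.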
