Jacek Sierajewski

Reputation: 633

UnicodeDecodeError: ('unknown', u'\xe0', 0, 1, '')

temp = "à la Carte"
print type(temp)
utemp = unicode(temp)

The code above raises an error. My goal is to process the temp string and use find to check whether it contains a specific substring, but I cannot get that far because of this error:

UnicodeDecodeError: ('unknown', u'\xe0', 0, 1, '')

Upvotes: 0

Views: 767

Answers (2)

Tamas Hegedus

Reputation: 29926

In Python 2, an ordinary string literal is a byte string, so it cannot hold a Unicode character such as à directly; it only holds whatever bytes the source file used for it. Even if the parser gets through the literal, unicode(temp) then tries to decode those bytes with the default codec (normally ASCII) and fails. That is why a separate unicode literal type exists. To make it work, first declare the encoding of the Python file, then use a unicode literal, like this:

# -*- coding: utf-8 -*-
temp = u"à la Carte"
print type(temp)
utemp = unicode(temp)
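
Once temp is a unicode object, the substring check mentioned in the question should work directly. A minimal sketch, assuming the file is saved as UTF-8 and that "Carte" is the substring being searched for (the question does not say which one); the names found_at and has_accent are only illustrative:

```python
# -*- coding: utf-8 -*-
# Sketch only: written so that it runs on Python 2 and on Python 3.3+.
temp = u"à la Carte"

# find() returns the index of the first match, or -1 if there is none.
found_at = temp.find(u"Carte")

# The "in" operator is enough when only a yes/no answer is needed.
has_accent = u"à" in temp

print(found_at)
print(has_accent)
```

Searching with unicode literals on both sides avoids an implicit ASCII decode of the search term.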

Upvotes: 1

BoarGules

Reputation: 16952

You need to specify the encoding; otherwise unicode() doesn't know what \xe0 means, because the meaning of that byte is encoding-specific.

>>> temp = "à la Carte"
>>> utemp = unicode(temp, encoding="Windows-1252")
>>> utemp
u'\xe0 la Carte'
>>> print utemp
à la Carte
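
To illustrate why the encoding matters, here is a small sketch that spells out the bytes explicitly. The byte values are assumptions about how the text might have been saved (one byte for à in Windows-1252, two bytes in UTF-8); the variable names are made up for the example, and str.decode() is used instead of unicode() so the same code also runs on Python 3:

```python
# Sketch only: runs on Python 2 and Python 3.
raw_cp1252 = b"\xe0 la Carte"      # "à" stored as one byte (Windows-1252)
raw_utf8 = b"\xc3\xa0 la Carte"    # "à" stored as two bytes (UTF-8)

# Decoding with the matching codec gives the same text in both cases.
from_cp1252 = raw_cp1252.decode("windows-1252")
from_utf8 = raw_utf8.decode("utf-8")
same_text = (from_cp1252 == from_utf8)

# ASCII has no meaning for byte 0xE0, so this decode is rejected.
try:
    raw_cp1252.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

print(same_text)
print(ascii_ok)
```

If the codec does not match the actual bytes, decoding either fails or silently produces the wrong characters, so it is worth checking how the input was really encoded.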

Upvotes: 1
