Reputation: 633
temp = "à la Carte"
print type(temp)
utemp = unicode(temp)
The code above raises an error. My goal is to take the temp string and use find to check whether it contains a specific substring, but I cannot get that far because of this error:
UnicodeDecodeError: ('unknown', u'\xe0', 0, 1, '')
Upvotes: 0
Views: 767
Reputation: 29926
In Python 2, an ordinary string literal is a byte string: it holds the encoded bytes of "à" rather than the Unicode character itself. So even if the parser gets through the file, unicode(temp) still fails, because it tries to decode those bytes with the default ASCII codec. That is why the unicode literal type exists. To make it work, first declare the encoding of the Python file, and second, use a unicode literal. Like this:
# -*- coding: utf-8 -*-
temp = u"à la Carte"
print type(temp)
utemp = unicode(temp)
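Once temp is a unicode string, the substring check from the question works directly. A minimal sketch, assuming the file is saved as UTF-8; the names position and has_accent are only illustrative, and the print calls are written so the snippet runs on both Python 2 and Python 3:

```python
# -*- coding: utf-8 -*-
# Sketch: substring checks on a unicode literal (variable names are illustrative).
temp = u"à la Carte"

# find() returns the index of the first match, or -1 if the substring is absent.
position = temp.find(u"Carte")

# The "in" operator is enough when only a yes/no answer is needed.
has_accent = u"à" in temp

print(position)
print(has_accent)
```

Keeping the search terms as unicode literals too avoids an implicit ASCII decode when Python 2 compares them with temp.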
Upvotes: 1
Reputation: 16952
You need to specify the encoding: otherwise unicode() doesn't know what \xe0 means, because that is encoding-specific.
>>> temp = "à la Carte"
>>> utemp = unicode(temp,encoding="Windows-1252")
>>> utemp
u'\xe0 la Carte'
>>> print utemp
à la Carte
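The same idea as a script, as a rough sketch: it assumes the input bytes are Windows-1252 (as in the session above) and spells them out with an escape so the result does not depend on the console or file encoding. The names raw, ascii_ok, text and found are illustrative only.

```python
# Sketch: decode bytes with an explicit codec, then search the decoded text.
# Assumption: the bytes are Windows-1252, where 0xE0 is "à".
raw = b"\xe0 la Carte"

# Decoding as ASCII fails, which is what unicode(temp) attempts by default in Python 2.
try:
    raw.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

# With the right codec named explicitly, the decode succeeds.
text = raw.decode("windows-1252")

# The substring check from the question now works on the decoded text.
found = text.find(u"Carte")

print(ascii_ok)
print(found)
```

If the bytes actually come from a UTF-8 source, "utf-8" has to be passed instead; the wrong codec either raises an error or silently produces the wrong characters.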
Upvotes: 1