Reputation: 12508
What is the correct way to convert '\xbb' into a unicode string? I have tried the following and only get UnicodeDecodeError:
unicode('\xbb', 'utf-8')
'\xbb'.decode('utf-8')
Upvotes: 0
Views: 1018
Reputation: 8851
It is not clear what you are trying to do, but in Python 3 all strings are Unicode by default. In Python 2.x you have to write u'my unicode string \xbb' (with single, double, or triple quotes) to get a unicode string. When you want to print a unicode string, you have to encode it in a character set that the output device (e.g. the terminal) supports, for instance u'my unicode string \xbb'.encode('iso-8859-1').
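As a rough sketch of the encoding step described above (the variable names are only for illustration, and ISO-8859-1 is assumed to be what the output device expects), encoding turns the text into the single byte 0xBB for the » character:

```python
# The u'' prefix is required in Python 2 and accepted (as a no-op) in Python 3.3+.
text = u'my unicode string \xbb'

# Encode the text to bytes in the character set the output device understands.
# ISO-8859-1 is an assumption here; a UTF-8 terminal would need 'utf-8' instead.
encoded = text.encode('iso-8859-1')

# U+00BB maps to the single byte 0xBB in ISO-8859-1, so the length is unchanged.
same_length = len(encoded) == len(text)
```

With 'utf-8' in place of 'iso-8859-1', the same character would be encoded as two bytes instead of one.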
Upvotes: 0
Reputation: 799310
Since it comes from Word it's probably CP1252.
>>> print '\xbb'.decode('cp1252')
»
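To illustrate why the UTF-8 attempts in the question fail while this one works, here is a small sketch written with an explicit bytes literal so that it also runs on Python 3 (the names raw and utf8_error are mine). The byte 0xBB is a continuation byte in UTF-8 and cannot start a character, whereas CP1252 maps it directly to U+00BB:

```python
raw = b'\xbb'  # the single byte from the question

# Decoding as UTF-8 fails: 0xBB (binary 10111011) is only valid as a
# continuation byte, never as the first byte of a character.
try:
    raw.decode('utf-8')
    utf8_error = None
except UnicodeDecodeError as exc:
    utf8_error = exc

# Decoding as CP1252 succeeds and yields the right-pointing
# double angle quotation mark, U+00BB.
as_cp1252 = raw.decode('cp1252')
```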
Upvotes: 8
Reputation: 12279
It looks to be Latin-1 encoded. You should use:
unicode('\xbb', 'Latin-1')
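For this particular byte, Latin-1 and CP1252 give the same result; the two codecs only disagree for bytes in the range 0x80 to 0x9F. A short sketch on a bytes literal (so it also runs on Python 3), where the 0x93 byte is just an example I picked and is not taken from the question:

```python
raw = b'\xbb'

# Both codecs map 0xBB to U+00BB, so either works for this byte.
from_latin1 = raw.decode('latin-1')
from_cp1252 = raw.decode('cp1252')

# They differ in the 0x80-0x9F range: in CP1252, 0x93 is a left curly
# quote (U+201C), while in Latin-1 it is the C1 control character U+0093.
quote_cp1252 = b'\x93'.decode('cp1252')
quote_latin1 = b'\x93'.decode('latin-1')
```

So if the text contains curly quotes or similar characters, the choice between the two codecs starts to matter.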
Upvotes: 1