Reputation: 63
Hi, I receive text via a library. When I print the received text, I see some non-English characters as "\u00e7", which should be "ç" instead. I guess I somehow need to encode and re-decode the text, but I am very new to Python and I do not know if that is the right approach. Can you please point me in the right direction?
Upvotes: 6
Views: 7123
Reputation: 369274
Decode the string using the unicode_escape encoding:
>>> s = r'\u00e7'
>>> print s
\u00e7
>>> print s.decode('unicode-escape')
ç
>>>
If sys.stdout.encoding is ascii, print will raise UnicodeEncodeError. In that case, encode the result explicitly:
>>> print s.decode('unicode-escape').encode('utf-8')
ç
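The snippets above use Python 2 syntax (the print statement and str.decode). On Python 3, str has no decode method, so the same idea needs a round trip through bytes. The following is only a minimal sketch under two assumptions: the interpreter is Python 3, and the incoming text contains only ASCII characters plus literal escape sequences. The helper name unescape is hypothetical and not part of any library.

```python
def unescape(text):
    """Turn literal escapes such as \\u00e7 into the characters they name.

    Assumption: `text` is pure ASCII. encode("ascii") raises
    UnicodeEncodeError otherwise, which is safer than silently
    mangling characters that are already non-ASCII.
    """
    # str -> bytes (ASCII), then bytes -> str while interpreting escapes.
    return text.encode("ascii").decode("unicode_escape")


s = r"\u00e7"           # six literal characters: \ u 0 0 e 7
decoded = unescape(s)   # a single character, U+00E7

# Print lengths rather than the character itself, so this runs
# even on a terminal whose encoding cannot represent it.
print(len(s), len(decoded))
```

If the terminal encoding cannot represent the character, writing the UTF-8 bytes directly with sys.stdout.buffer.write(decoded.encode('utf-8')) is one possible Python 3 counterpart to the explicit encode shown above.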
Upvotes: 5