UTF-8 encoding in str type Python 2

Question

I have Python 2.7 code that retrieves a base64-encoded response from a server. The response is decoded with the base64 module (b64decode / decodestring, both of which return str). The decoded content contains the Unicode code points of the original strings rather than their UTF-8 bytes.

I need to convert these Unicode code points to UTF-8.

The original string contains the substring "Não". When I decode the server's response, I get:

>>> encoded_str = ... # server response
>>> decoded_str = base64.b64decode(encoded_str)
>>> type(decoded_str)
<type 'str'>
>>> decoded_str[x:y]
'N\xe3o'

When I try to encode it to UTF-8, I get an error:

>>> decoded_str[x:y].encode('utf-8')
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe3 in position 2: ordinal not in range(128)

However, when I write the same string manually as a unicode literal, I can convert it to the UTF-8 string I want:

>>> test_str = u'N\xe3o'
>>> test_str.encode('utf-8')
'N\xc3\xa3o'

I need to retrieve this response from the server and produce a UTF-8 string that prints as "Não". How can I do this in Python 2?
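For reference, here is a minimal sketch of one possible approach. It assumes the decoded bytes are Latin-1 (ISO-8859-1), which fits the byte 0xe3 standing for "ã" but is not confirmed by the server. In Python 2, calling .encode() on a str first decodes it implicitly with the ascii codec, which is where the UnicodeDecodeError comes from; decoding explicitly with the assumed codec avoids that step. The encoded_str value below is a hypothetical stand-in for the real server response.

```python
import base64

# Hypothetical stand-in for the server response: the Latin-1 bytes
# 'N\xe3o' (the word from the question), base64-encoded.
encoded_str = base64.b64encode(b'N\xe3o')

# b64decode returns a byte string (str in Python 2).
decoded_bytes = base64.b64decode(encoded_str)

# Assumption: the payload is Latin-1, so decode it explicitly
# instead of relying on the implicit ascii decode.
text = decoded_bytes.decode('latin-1')   # unicode object: u'N\xe3o'

# Encode the unicode object to UTF-8 bytes.
utf8_bytes = text.encode('utf-8')        # byte string: 'N\xc3\xa3o'
```

If the server actually uses a different single-byte encoding (cp1252, for example), only the codec name passed to decode() would need to change.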
