papico

Reputation: 39

str to bytes in Python 3.3

How can I get b'\xe3\x81\x82' from '\xe3\x81\x82'?

Ultimately, I want u'\u3042', which is the Japanese letter 'あ'.

b'\xe3\x81\x82'.decode('utf-8') produces u'\u3042', but

'\xe3\x81\x82'.decode('utf-8') raises the following error:

AttributeError: 'str' object has no attribute 'decode'

because b'\xe3\x81\x82' is bytes and '\xe3\x81\x82' is str.

I have a database with data like '\xe3\x81\x82'.

Upvotes: 3

Views: 1377

Answers (1)

Martijn Pieters

Reputation: 1121266

If you have bytes disguised as Unicode code points, encode to Latin-1, then decode the resulting bytes as UTF-8:

'\xe3\x81\x82'.encode('latin1').decode('utf-8')

Latin-1 (ISO-8859-1) maps the first 256 Unicode code points (U+0000 through U+00FF) one-to-one to bytes of the same value:

>>> '\xe3\x81\x82'.encode('latin1').decode('utf-8')
'あ'
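
If the database holds a mix of such strings and ordinary text, the round trip can fail, so it may help to wrap it. The following is only a sketch: the helper name fix_mojibake is hypothetical, and it assumes that any string which cannot make the Latin-1 to UTF-8 round trip should be returned unchanged.

```python
def fix_mojibake(text):
    """Reinterpret a str whose code points are really UTF-8 byte values.

    Hypothetical helper; returns the input unchanged when the round trip fails.
    """
    try:
        # Latin-1 turns each code point U+0000..U+00FF into the byte of the same value.
        raw = text.encode('latin-1')
    except UnicodeEncodeError:
        # A code point above U+00FF cannot be a disguised byte, so leave the text alone.
        return text
    try:
        return raw.decode('utf-8')
    except UnicodeDecodeError:
        # The bytes are not valid UTF-8, so the text was probably not disguised UTF-8.
        return text


print(fix_mojibake('\xe3\x81\x82'))  # あ
```

Text that is already correct, such as 'あ' itself, passes through untouched because it cannot be encoded as Latin-1. Note that a genuine Latin-1 string which happens to form valid UTF-8 would still be converted, so this is best applied only to columns known to be affected.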

Upvotes: 4
