user2937801

Reputation: 1

How can I convert a list of byte values to a unicode object and encode it as UTF-8?

I have a list, [0x97, 0x52], not a unicode object. These two values are the Unicode code point of the character '青' (u'\u9752'). How can I first convert this list to a unicode object and then encode it as UTF-8?

Upvotes: 0

Views: 90

Answers (2)

tripleee

Reputation: 189357

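Not sure if this is the most elegant way, but it works for this particular example: join the values into a byte string and decode it as big-endian UTF-16. Calling .encode('utf-8') on the result then gives the UTF-8 bytes.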

>>> ''.join([chr(x) for x in [0x97, 0x52]]).decode('utf-16be')
u'\u9752'
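
For readers on Python 3, here is a rough equivalent of the same idea. It is only a sketch, assuming the list always holds big-endian UTF-16 byte values as in the question; the variable names are made up for illustration.

```python
# Python 3 sketch: the list is assumed to hold big-endian UTF-16 byte values.
byte_values = [0x97, 0x52]

raw = bytes(byte_values)        # b'\x97R', the two raw bytes
text = raw.decode('utf-16-be')  # str (unicode) object: '青'
utf8 = text.encode('utf-8')     # UTF-8 bytes: b'\xe9\x9d\x92'

print(text)
print(utf8)
```

In Python 3 the built-in bytes() constructor accepts a list of integers directly, so the chr/join step is not needed.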

Upvotes: 0

Eser Aygün

Reputation: 8004

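# Python 2
byte_values = [0x97, 0x52]   # renamed from `bytes` to avoid shadowing the built-in

code = byte_values[0] * 256 + byte_values[1]  # build the 16-bit code (high byte first)
char = unichr(code)               # convert code to unicode
utf8 = char.encode('utf-8')       # encode unicode as utf-8
print utf8                        # prints 青 on a UTF-8 terminal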
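
The arithmetic above covers a single 16-bit value. A possible extension to a longer list, written for Python 3, might look like the sketch below. The helper name pairs_to_utf8 is hypothetical, and it assumes every pair is a Basic Multilingual Plane code point (no surrogate pairs), high byte first.

```python
def pairs_to_utf8(values):
    """Hypothetical helper: combine (high, low) byte pairs into characters.

    Assumes each pair is one BMP code point, high byte first.
    """
    if len(values) % 2:
        raise ValueError("expected an even number of byte values")
    chars = []
    # Walk the list two items at a time: even indexes are high bytes,
    # odd indexes are low bytes.
    for hi, lo in zip(values[0::2], values[1::2]):
        chars.append(chr(hi * 256 + lo))  # chr() replaces Python 2's unichr()
    return "".join(chars).encode("utf-8")


result = pairs_to_utf8([0x97, 0x52, 0x5C, 0x71])
print(result.decode("utf-8"))
```

For text that may contain characters outside the BMP, decoding the raw bytes with the 'utf-16-be' codec is the safer route, since the codec handles surrogate pairs itself.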

Upvotes: 2
