Reputation: 9269
I have a set of Unicode code points. I need to convert each one to UTF-8 and print the result split into hex values.
e.g. Unicode 0x80 should be converted to UTF-8 and printed as (0xc2,0x80)
I tried the following:
str(unichr(0x80).encode('utf-8')).split(r'\x')[0]
But it does not get split into ['c2','80']. Instead it gives me ['\xc2\x80'].
I need this for code generation.
Upvotes: 0
Views: 1236
Reputation: 28934
To generate a list of the hexadecimal values of the bytes in your UTF-8-encoded string, use the following:
>>> [hex(ord(x)) for x in unichr(0x80).encode('utf-8')]
['0xc2', '0x80']
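Since the question asks for output shaped like (0xc2,0x80) for code generation, here is a minimal sketch of a wrapper around the same idea. The name utf8_hex_tuple is hypothetical, and the try/except shim is an assumption added so the snippet also runs on Python 3, where unichr no longer exists.

```python
# Sketch only: utf8_hex_tuple is a made-up helper name, not a library function.
try:
    unichr          # Python 2
except NameError:
    unichr = chr    # Python 3: chr() already returns a Unicode character

def utf8_hex_tuple(code_point):
    """Return the UTF-8 bytes of code_point formatted like '(0xc2,0x80)'."""
    encoded = unichr(code_point).encode('utf-8')
    # bytearray yields integers on both Python 2 and Python 3
    return '(' + ','.join('0x%02x' % b for b in bytearray(encoded)) + ')'

print(utf8_hex_tuple(0x80))
```

On a narrow Python 2 build, unichr raises ValueError for code points above 0xFFFF, so those would need separate handling.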
Upvotes: 2
Reputation: 28268
You are trying to split on \x, but \x doesn't exist in the string. \xc2\x80 is just how the two bytes are displayed as escape codes on your screen (like \n for newline). I think what you want is this (shown here for the first byte):
print hex(ord(unichr(0x80).encode('utf-8')[0]))
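To make the point about escape codes concrete, here is a small check you can run. It is only an illustration, written with a u'\x80' literal (an assumption, chosen so it also runs on Python 3) instead of unichr(0x80).

```python
encoded = u'\x80'.encode('utf-8')   # same bytes as unichr(0x80).encode('utf-8')

length = len(encoded)               # the encoded string holds two bytes
has_backslash = b'\\' in encoded    # no backslash is actually stored in it
shown = repr(encoded)               # the escaped form the interpreter displays

print(length)
print(has_backslash)
print(shown)
```

The backslashes only appear in the repr, which is why splitting the real string on r'\x' finds nothing to split on.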
Upvotes: 1
Reputation: 123841
Do you want something like this? It can be done with a list comprehension.
>>> ["%x"%ord(x) for x in unichr(0x80).encode('utf-8')]
['c2', '80']
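One thing to watch if this feeds a code generator: "%x" does not zero-pad, so a byte below 0x10 comes out as a single digit. A sketch of the difference, using a tab character as the example and bytearray so that it behaves the same on Python 2 and 3 (both choices are mine, not part of the question):

```python
encoded = u'\t'.encode('utf-8')   # U+0009 encodes to the single byte 0x09

unpadded = ["%x" % b for b in bytearray(encoded)]    # one digit
padded = ["%02x" % b for b in bytearray(encoded)]    # always two digits

print(unpadded)
print(padded)
```

Use "%02x" if every byte must be exactly two hex digits.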
Upvotes: 2