Reputation: 21
I am encoding Chinese characters using gb18030 in Python, and I want to access part of the encoded string. For example, the encoded string for 李 is '\xc0\xee', and I want to extract 'c0' and 'ee' from it. However, Python treats '\xc0\xee' not as an 8-character string but as a 2-character string. How do I turn it into an 8-character string so that I can access the individual Roman letters in it?
Upvotes: 1
Views: 21
Reputation: 2558
How about this:
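li = "李"
# str() on a bytes object returns its repr, i.e. the text b'\xc0\xee' (Python 3)
values = str(li.encode('gb18030'))
# split on the literal "\x", drop the leading "b'" piece, strip the closing quote
values = [i.strip("'") for i in values.split("\\x")[1:]]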
print(values)
['c0', 'ee']
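One caveat with splitting the repr: bytes that fall in the printable ASCII range are displayed as plain characters rather than as \x escapes, so the split can miss them. A minimal alternative sketch, assuming Python 3 (where encode() returns a bytes object), formats each byte as hex directly. The variable name `encoded` is only illustrative.

```python
li = "李"
encoded = li.encode("gb18030")  # a bytes object holding two bytes

# Iterating over bytes yields integers (0-255);
# format each one as two lowercase hex digits.
values = [format(b, "02x") for b in encoded]
print(values)  # ['c0', 'ee']

# bytes.hex() returns the same digits joined into a single string.
print(encoded.hex())  # c0ee
```

This avoids depending on how repr() chooses to display each byte.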
How do you use repr() to get the values you are looking for?
Upvotes: 0