Reputation: 31272
Here is the Chinese string I am trying to print: 統計情報. I want to see its Unicode representation in the Python console. The string is stored in a file, so this is what I tried:
import codecs
with codecs.open("testutf8.txt", "r", "utf-8") as f:
    fa = f.read()
print fa.encode('utf-8')
This still prints the Chinese characters in the console. I want to see the Unicode representation of the string in the console instead.
Thanks
Upvotes: 1
Views: 1098
Reputation: 363394
The 'unicode-escape' encoding can show you the codepoints:
>>> s = u'統計情報'
>>> print(s.encode('unicode-escape'))
\u7d71\u8a08\u60c5\u5831
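On Python 3 the encode call returns bytes, so the printed value carries a b'...' prefix and doubled backslashes. A minimal sketch that yields a plain text string on either version, where escape_text is a hypothetical helper name and not part of any library:

```python
def escape_text(text):
    """Return text with its non-ASCII characters written as escape sequences."""
    # encode() yields bytes; the escaped form is pure ASCII, so decoding
    # it as ASCII gives back an ordinary text string.
    return text.encode('unicode_escape').decode('ascii')

print(escape_text(u'統計情報'))
```

Decoding as ASCII is safe here because the escaped bytes never contain anything outside the ASCII range.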
But if you want to use those integers directly, it's better to apply ord:
>>> ord(s[0])
32113
>>> 0x7d71
32113
>>> [hex(ord(c)) for c in s]
['0x7d71', '0x8a08', '0x60c5', '0x5831']
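The ord approach works unchanged on both Python 2 and Python 3. The encode call also works on both, but on Python 3 it returns bytes, so print shows b'\\u7d71\\u8a08\\u60c5\\u5831'; append .decode('ascii') to get the output shown above.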
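Since the string in the question lives in a file, here is a rough sketch of the same idea applied to file contents. It assumes the file is UTF-8 encoded; read_code_points is a hypothetical helper name, and the temporary file only stands in for testutf8.txt so the example runs on its own:

```python
import io
import os
import tempfile

def read_code_points(path):
    """Read a UTF-8 text file and return its characters as U+XXXX strings."""
    # io.open decodes to text on both Python 2 and Python 3.
    with io.open(path, 'r', encoding='utf-8') as f:
        text = f.read()
    return ['U+%04X' % ord(c) for c in text]

# Throwaway file standing in for testutf8.txt (an assumption for the demo).
tmp = tempfile.NamedTemporaryFile(suffix='.txt', delete=False)
tmp.write(u'統計情報'.encode('utf-8'))
tmp.close()

code_points = read_code_points(tmp.name)
os.remove(tmp.name)
print(code_points)
```

With a real file, call read_code_points("testutf8.txt") directly and drop the temporary-file lines.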
Upvotes: 3