brain storm
brain storm

Reputation: 31272

how to print unicode string of chinese characters in python 2.6?

Here is the chinese string I am trying to print out 統計情報. I want to see unicode representation python console. This string is in file. so this is what I tried.

import codecs
with codecs.open("testutf8.txt", "r", "utf-8") as f:
     fa=f.read()
     print fa.encode('utf-8')

This still prints chinese characters in console. I want to see unicode string on console

Thanks

Upvotes: 1

Views: 1098

Answers (1)

wim
wim

Reputation: 363394

The 'unicode-escape' encoding can show you the codepoints:

>>> s = u'統計情報'
>>> print(s.encode('unicode-escape'))
\u7d71\u8a08\u60c5\u5831

But if you want to use those integers directly, it's better to apply ord:

>>> ord(s[0])
32113
>>> 0x7d71
32113
>>> [hex(ord(c)) for c in s]
['0x7d71', '0x8a08', '0x60c5', '0x5831']

What I've described here works on both Python 2 and Python 3.

Upvotes: 3

Related Questions