AngelIW

Reputation: 23

Is this a Python Unicode escape error?

I'm learning about encodings in Python and ran into the following behavior, which looks weird to me. I'd like to know why it happens.

First of all, this is my environment: OS X 10.10.3

Output of the command echo $LC_CTYPE, $LANG is: en_US.UTF-8, en_US.UTF-8

Output of python --version is Python 2.7.6

Then I type python to enter the Python shell:

>>> import sys; reload(sys); sys.setdefaultencoding('utf8')
<module 'sys' (built-in)>
>>> s16 = u'我'.encode('utf16')
>>> s16
'\xff\xfe\x11b'
>>> for c in s16:
...   ord(c)
... 
255
254
17
98
>>> s16_ = '\xff\xfe\x11\x62'
>>> s16_
'\xff\xfe\x11b'

So my question is: on the 4th line and on the last line, why does Python output '\xff\xfe\x11b' instead of '\xff\xfe\x11\x62'?

Upvotes: 0

Views: 101

Answers (2)

Yu Hao

Reputation: 122493

\x62 is the ASCII code for b, and b is a printable character, so repr() shows the character itself, not the escaped form.


Reference: str.isprintable (a Python 3 method; the same repr() rule applies to a Python 2 str):

Note that printable characters in this context are those which should not be escaped when repr() is invoked on a string.
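To make this concrete, here is a minimal sketch showing that the two spellings denote the same four bytes and that repr() picks the printable form for the last one. It assumes Python 3, where byte strings carry a b'' prefix; in Python 2.7 the same bytes are a plain str and the repr has no prefix. The variable names are only illustrative.

```python
# Assumption: Python 3 (byte strings are written b'...').
escaped = b'\xff\xfe\x11\x62'   # last byte written as a hex escape
literal = b'\xff\xfe\x11b'      # last byte written as the character b

# Both literals produce exactly the same four bytes.
same_value = (escaped == literal)

# The numeric value of each byte; 0x62 is 98, the ASCII code for 'b'.
byte_values = list(bytearray(escaped))

# repr() escapes the non-printable bytes and shows the printable one as-is.
shown = repr(escaped)

print(same_value)
print(byte_values)
print(shown)
```

So nothing is lost or changed: the escape is just one of several ways to write the same byte in source code, and repr() chooses the shortest readable one.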

Upvotes: 0

lvc

Reputation: 35089

When Python displays a byte string (str in Python 2), as the interactive shell does via repr(), it shows the corresponding ASCII character for each byte that is printable, and a hex escape otherwise.

\x62 corresponds to ASCII 'b'. You can see this by just looking at that byte:

>>> '\x62'
'b'
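If you want to see every byte in hex regardless of whether it is printable, you can format the bytes yourself. The following is a rough sketch, not the only way to do it; it uses the utf-16-le codec (an assumption made here so that no BOM is emitted and the result does not depend on the machine's byte order), and the names hex_view and escaped_view are hypothetical.

```python
import binascii

# U+6211 is the character 我; utf-16-le encodes it as two bytes, without a BOM.
data = u'\u6211'.encode('utf-16-le')

# Plain hex digits for the whole byte string.
hex_view = binascii.hexlify(data).decode('ascii')

# One \xNN escape per byte, even for printable bytes such as 0x62 ('b').
escaped_view = ''.join('\\x%02x' % b for b in bytearray(data))

print(hex_view)
print(escaped_view)
```

This only changes how the bytes are displayed; the byte string itself is the same as the one repr() shows with a trailing b.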

Upvotes: 3
