MichM

Reputation: 896

Python 2.7 prints Unicode as a square box in the Python console

I use the latest Python 2 with PyCharm on a Mac.

In the Python console, I noticed that if I do print u'\u31d4', the Python console prints out a half square box, ㇔. So does print u'\u31d2'. But U+31D4 should be a CJK stroke, as seen in http://unicode-table.com/en/search/?q=31d4, and U+31D2 should be a different stroke, as in http://unicode-table.com/en/search/?q=31d2.

Questions:

  1. What can I do so that the Python console prints out these strokes correctly?

  2. A related question: the Python console currently doesn't print out Unicode characters by default, unless I explicitly call print. For example:

    (console prompt)>>> a = u'\u4e00'

    (console prompt)>>> a

The console prints out u'\u4e00'.

Only if I explicitly use print a do I get 一 back. Can I change a setting somewhere so that the console prints the character in response to typing a, without me having to call print?

Upvotes: 1

Views: 2943

Answers (3)

donkopotamus

Reputation: 23186

Question 2

What is displayed by the interpreter is governed by the function sys.displayhook. Loosely speaking, the default display hook displays the repr of the value unless it is None.

To alter the display hook, set sys.displayhook to another function. For example:

>>> a = u'\u4e00'
>>> a
u'\u4e00'
>>> import sys
>>> def my_display(x):
...     if x is None:  # like the default hook, show nothing for None
...         return
...     if isinstance(x, unicode):
...         # assumes the console expects UTF-8
...         sys.stdout.write(x.encode("utf-8"))
...     else:
...         sys.stdout.write(repr(x))
...     sys.stdout.write("\n")
...
>>> sys.displayhook = my_display
>>> a
一
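
If you want something a bit closer to what the default hook does, here is a rough sketch rather than a definitive implementation. The name unicode_displayhook is made up for illustration, and the UTF-8 fallback is an assumption about your console. It also keeps `_` bound to the last echoed value, and it is written so that the same file loads under Python 2 and Python 3:

```python
import sys

try:
    import __builtin__ as builtins  # Python 2
    text_type = unicode
except ImportError:
    import builtins  # Python 3
    text_type = str

PY2 = sys.version_info[0] == 2


def unicode_displayhook(value):
    """Echo text values as rendered characters and everything else as repr()."""
    if value is None:
        return  # like the default hook, show nothing for None
    text = value if isinstance(value, text_type) else repr(value)
    if PY2 and isinstance(text, text_type):
        # Python 2 byte streams need encoded bytes; fall back to UTF-8
        # (an assumption) when the stream does not report an encoding.
        encoding = getattr(sys.stdout, "encoding", None) or "utf-8"
        text = text.encode(encoding, "replace")
    sys.stdout.write(text)
    sys.stdout.write("\n")
    builtins._ = value  # keep `_` pointing at the last echoed value


# Install the hook; it only takes effect in an interactive session.
sys.displayhook = unicode_displayhook
```

To make it stick across sessions you could put this in the file named by the PYTHONSTARTUP environment variable, although whether an IDE console honours that variable depends on the IDE.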

Upvotes: 1

Mark Tolonen

Reputation: 177725

Question 1 depends on your IDE's font support. If the console font has no glyph for a character, you get a replacement box instead. Switch the console to a font that covers those characters, or use a different IDE.

Question 2: That is Python 2's default for the interactive console: the repr of the string, which is ASCII-only with escape codes for non-ASCII characters. Python 3 still quotes the string, but displays printable Unicode characters as they are. print is the correct way to render the string. There is no console setting that changes this default; it is that way to aid debugging. Consider:
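
One way to convince yourself that the string is fine and only the font is at fault is to ask Python what the characters are, which does not involve any glyphs at all. A small sketch using the standard unicodedata module (the helper name describe is made up for illustration):

```python
import unicodedata


def describe(ch):
    """Return 'U+XXXX NAME' for one character, independent of any font."""
    return "U+{:04X} {}".format(ord(ch), unicodedata.name(ch, "<unnamed>"))


# The two strokes from the question, plus the ideograph used later on.
for ch in (u"\u31d2", u"\u31d4", u"\u4e00"):
    print(describe(ch))
```

If the names come back as CJK strokes, the data is correct and the box is purely a rendering problem in the console.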

>>> s
u'\xa0\xa0\xa0'
>>> print s

>>>

How would you know what the content of s was otherwise? From the first form you know that it is three Unicode characters and that each one is the code point U+00A0, whereas rendering three non-breaking spaces doesn't tell you much.

Upvotes: 1

Nhan Hoang

Reputation: 147

Take a look at the IPython QtConsole; it supports Unicode output very well.

Upvotes: 1
