Reputation: 10817
The output of the program
# -*- coding: utf-8 -*-
j = "Jürgen"
jlist = [j]
print j, type(j)
print jlist, type(jlist)
is
Jürgen <type 'str'>
['J\xc3\xbcrgen'] <type 'list'>
There is nothing wrong here. \xc3\xbc is just the UTF-8 encoding of ü. What I'm trying to understand is the difference. Why do the OS X terminal (which otherwise handles UTF-8-encoded unicode just fine) and the debugger (PyCharm) display the encoded bytes when the string is inside the list, but display the actual (un-encoded) character when it is printed on its own?
Upvotes: 2
Views: 59
Reputation: 20336
Because print uses str() to turn each argument into text, print j writes the bytes of j unchanged; the terminal decodes them as UTF-8 and shows ü. str(jlist), however, produces the string version of the list, and a list builds that string by calling repr() on each element. repr() is the raw format. That means that a tab will be displayed as \t, not as a bunch of spaces; a new line will be displayed as \n, not as a new line, etc. For a Python 2 byte string it also means that every non-ASCII byte is escaped, which is why ü shows up as \xc3\xbc. The reason for that is that if you are printing a list, it is probably for debugging or testing. In those cases, you really want to know what is going on in the background.
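A small sketch of the difference, written so it should run on both Python 2 and Python 3. The variable names (name_bytes, tabbed, list_text) are made up for illustration, and the byte string is built with encode() so that the source stays pure ASCII; in the question's Python 2 program, j already holds these same bytes.

```python
# Hypothetical names; only ASCII appears in this source file.
name_bytes = u"J\u00fcrgen".encode("utf-8")  # the UTF-8 bytes for the name
tabbed = "a\tb"                              # contains a real tab character

# repr() shows escapes instead of the characters themselves.
print(repr(tabbed))      # 'a\tb'
print(repr(name_bytes))  # Python 2: 'J\xc3\xbcrgen'   Python 3: b'J\xc3\xbcrgen'

# The string form of a list is built from the repr() of each element.
list_text = str([tabbed])
print(list_text)         # ['a\tb']
```

If you want the readable form of each element, join the elements yourself (for example ", ".join(jlist) in the question's program) instead of printing the list object.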
Upvotes: 2