Johannes Schwaninger

Reputation: 221

Python list items encoding

Why does the encoding seem to change in Python 2.7 when I iterate over the items of a list?

test_list = ['Hafst\xc3\xa4tter', '[email protected]']

Printing the list:

print(test_list)

gets me this output:

['Hafst\xc3\xa4tter', '[email protected]']

So far, so good. But why is it that when I iterate over the list, like this:

for item in test_list:
    print(item)

I get this output:

Hafstätter
[email protected]

Why does the encoding change (or does it)? And how can I change the encoding within the list?

Upvotes: 0

Views: 96

Answers (1)

Mark Tolonen

Reputation: 177575

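The encoding isn't changing; these are just two different ways of displaying the same string. Printing a list shows the repr() of each item, which displays non-ASCII bytes as escape codes for debugging, while printing an item directly writes its bytes to the terminal as they are: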

>>> test_list = ['Hafst\xc3\xa4tter', '[email protected]']
>>> print(test_list)
['Hafst\xc3\xa4tter', '[email protected]']
>>> for item in test_list:
...     print(item)
...     
Hafstätter
[email protected]

But the two forms are equivalent (here, typed at a UTF-8 terminal):

>>> 'Hafst\xc3\xa4tter' == 'Hafstätter'
True
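As a minimal sketch of what is going on: converting a list to text uses repr() on each item, which is where the escape codes come from. The example uses b'...' literals so that it also runs on Python 3, where the Python 2 str type corresponds to bytes; the UTF-8 decoding step is an assumption about how the data was encoded.

```python
item = b'Hafst\xc3\xa4tter'   # the same UTF-8 bytes as in the question

# str() of a list is built from the repr() of each element,
# so the escaped form shows up only in the list display.
list_display = str([item])
assert list_display == '[' + repr(item) + ']'

# The bytes themselves are untouched. Decoding them as UTF-8
# (an assumption about the data) yields the readable text.
text = item.decode('utf-8')
assert text == u'Hafst\xe4tter'
```

Nothing in the list was re-encoded at any point; only the display function differs.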

If you want to see lists displayed with the non-debugging output, you have to generate it yourself:

>>> print("['" + "', '".join(test_list) + "']")
['Hafstätter', '[email protected]']
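If you need this in more than one place, it could be wrapped in a small helper. The sketch below is only an illustration: format_list is a hypothetical name, the UTF-8 default is an assumption, and the second list item is a placeholder value. It decodes byte strings first so the result is text on both Python 2 and Python 3.

```python
def format_list(items, encoding='utf-8'):
    """Hypothetical helper: render a list of strings without escape codes.

    Byte strings are decoded with `encoding` (assumed UTF-8 here);
    items that are already text are passed through unchanged.
    Quotes inside items are not escaped, so this is for display only.
    """
    decoded = [
        item.decode(encoding) if isinstance(item, bytes) else item
        for item in items
    ]
    return u"['" + u"', '".join(decoded) + u"']"


# Placeholder data: the first item is the UTF-8 byte string from the question.
rendered = format_list([b'Hafst\xc3\xa4tter', b'plain ascii'])
print(rendered)
```

The output is not a valid Python literal in general, which is one reason the built-in list display sticks to repr().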

There is a reason for the debugging output: two strings can look identical when printed and still contain different bytes:

>>> a = 'a\xcc\x88'
>>> b = '\xc3\xa4'
>>> a
'a\xcc\x88'
>>> print a,b   # should look the same, if not it is the browser's fault :)
ä ä
>>> a==b
False
>>> [a,b]      # In a list you can see the difference by default.
['a\xcc\x88', '\xc3\xa4']
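To see why those two strings differ, here is a short sketch that decodes both (assuming UTF-8) and compares the resulting code points. The first is the letter a followed by a combining diaeresis (U+0308), the second is the single precomposed character U+00E4. Using unicodedata.normalize from the standard library is my addition, shown as one possible way to compare such strings, not something the original code does.

```python
import unicodedata

a = b'a\xcc\x88'   # 'a' + COMBINING DIAERESIS, encoded as UTF-8
b = b'\xc3\xa4'    # precomposed LATIN SMALL LETTER A WITH DIAERESIS

ua = a.decode('utf-8')
ub = b.decode('utf-8')

# Two code points versus one, so the strings compare unequal
# even though they render as the same glyph.
lengths = (len(ua), len(ub))
differ = ua != ub

# After NFC normalization both become the single precomposed character.
same_after_nfc = (
    unicodedata.normalize('NFC', ua) == unicodedata.normalize('NFC', ub)
)
```

The escaped display in a list makes this kind of difference visible without any extra work.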

Upvotes: 1
