Reputation: 221
Why does the encoding change in Python 2.7 when I iterate over the items of a list?
test_list = ['Hafst\xc3\xa4tter', '[email protected]']
Printing the list:
print(test_list)
gets me this output:
['Hafst\xc3\xa4tter', '[email protected]']
So far, so good. But when I iterate over the list, like this:
for item in test_list:
    print(item)
I get this output:
Hafstätter
[email protected]
Why does the encoding change (does it?), and how can I change the encoding within the list?
Upvotes: 0
Views: 96
Reputation: 177575
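The encoding isn't changing; these are just two ways of displaying the same string. Printing a list shows the repr() of each item, which writes non-ASCII bytes as escape codes for debugging, while printing an item directly writes the bytes themselves: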
>>> test_list = ['Hafst\xc3\xa4tter', '[email protected]']
>>> print(test_list)
['Hafst\xc3\xa4tter', '[email protected]']
>>> for item in test_list:
...     print(item)
...
Hafstätter
[email protected]
But they are equivalent:
>>> 'Hafst\xc3\xa4tter' == 'Hafstätter'
True
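As a minimal sketch of where the escaped form comes from: converting a list to a string calls repr() on every element. The second item below is a hypothetical placeholder, and the b prefix is an assumption added so the example behaves the same way on Python 2.7 and Python 3:

```python
# Byte strings; on Python 2.7 the b prefix is optional and changes nothing.
items = [b'Hafst\xc3\xa4tter', b'plain ascii']  # second item is a placeholder

# str() of a list is built from repr() of each element, not str().
list_display = str(items)
rebuilt = '[' + ', '.join(repr(item) for item in items) + ']'

print(list_display)
print(list_display == rebuilt)  # True
```

On Python 3 the display additionally carries the b prefix, but the \xc3\xa4 escapes appear in both versions.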
If you want to see lists displayed with the non-debugging output, you have to generate it yourself:
>>> print("['" + "', '".join(test_list) + "']")
['Hafstätter', '[email protected]']
The debugging output exists for a reason. Two strings can look identical when printed yet contain different bytes:
>>> a = 'a\xcc\x88'
>>> b = '\xc3\xa4'
>>> a
'a\xcc\x88'
>>> print a,b # should look the same, if not it is the browser's fault :)
ä ä
>>> a==b
False
>>> [a,b] # In a list you can see the difference by default.
['a\xcc\x88', '\xc3\xa4']
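The two strings differ because the first is the letter a followed by a combining diaeresis (U+0308), while the second is the single precomposed character U+00E4. As a hedged sketch, assuming the bytes are UTF-8 as in the session above, they can be compared after Unicode normalization with the standard unicodedata module. The names ua and ub are illustrative, and the b prefix is added so the code runs on both Python 2.7 and Python 3:

```python
import unicodedata

a = b'a\xcc\x88'  # 'a' + COMBINING DIAERESIS (U+0308), UTF-8 encoded
b = b'\xc3\xa4'   # precomposed LATIN SMALL LETTER A WITH DIAERESIS (U+00E4)

# Normalization works on Unicode text, so decode the bytes first.
ua = a.decode('utf-8')
ub = b.decode('utf-8')

print(ua == ub)  # False: the code point sequences differ

# NFC composes 'a' + U+0308 into U+00E4, so the normalized forms match.
print(unicodedata.normalize('NFC', ua) == unicodedata.normalize('NFC', ub))  # True
```

Normalizing only affects comparison here; the original byte strings in the list are left untouched.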
Upvotes: 1