Reputation: 18967
I'm using the Notepad++ editor on Windows with the format set to ASCII. I've read "PEP 263: Source Code Encodings" and amended my code accordingly (I think), but some characters are still printing as hex escapes...
#!/usr/bin/python
# -*- coding: UTF-8 -*-
import os, sys
a_munge = [ "A", "4", "/\\", "\@", "/-\\", "^", "aye", "?" ]
b_munge = [ "B", "8", "13", "I3", "|3" , "P>", "|:", "!3", "(3", "/3", "3","]3" ]
c_munge = [ "C", "<", "(", "{", "(c)" ]
d_munge = [ "D", "|)", "|o", "?", "])", "[)", "I>", "|>", " ?", "T)", "0", "cl" ]
e_munge = [ "E", "3", "&", "€", "£", "[-", "|=-", "?" ]
.
.
.
Upvotes: 0
Views: 761
Reputation: 82924
print some_list is in effect print repr(some_list), because printing a list shows the repr() of each element; that's why you see \u20ac instead of a Euro character. For debugging purposes, the "unicode hex" is exactly what you need for unambiguous display of your data.
You appear to have perfectly OK unicode objects in your list; I suggest that you don't "print" the list to Tkinter.
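A rough sketch of that suggestion: rather than printing the list itself, join its elements into one string and display that. The list is the question's e_munge with u prefixes added; display_text is a hypothetical name, and the code is written so that it should run unchanged on Python 2.7 and Python 3.3+.

```python
# -*- coding: utf-8 -*-
# Escapes are used instead of literal characters so this example does not
# depend on the encoding the file happens to be saved in.
e_munge = [u"E", u"3", u"&", u"\u20ac", u"\u00a3", u"[-", u"|=-", u"?"]

# Joining gives one unicode string holding the real characters, rather than
# the escaped repr() form that printing the whole list shows on Python 2.
display_text = u" ".join(e_munge)

# Pass display_text (not the list) to whatever displays it,
# for example the text option of a Tkinter widget.
print(len(display_text))
```

The Euro sign is then a single character inside display_text, so whatever displays the string receives the character itself and not its escape sequence.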
Upvotes: 1
Reputation: 177461
The line:
# -*- coding: UTF-8 -*-
declares that the source file is saved in UTF-8. Saving the file in any other encoding while keeping this declaration is an error.
When you declare byte strings in your source code:
e_munge = [ "E", "3", "&", "€", "£", "[-", "|=-", "?" ]
then byte strings like "€" will actually contain the encoded bytes used to save the source file.
When you use Unicode strings instead:
e_munge = [ u"E", u"3", u"&", u"€", u"£", u"[-", u"|=-", u"?" ]
then Python uses the declared encoding to decode the bytes between the quotes, so u"€" becomes the single Unicode character U+20AC.
An illustration:
# coding: utf-8
bs = '€'
us = u'€'
print repr(bs)
print repr(us)
OUTPUT:
'\xe2\x82\xac'
u'\u20ac'
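The relationship between those two values can be checked directly: decoding the byte string with the declared encoding should give the Unicode string, and encoding goes the other way. A small sketch, using escapes so it does not depend on how the file is saved, and the b'' prefix (Python 2.6+ and 3) to make the byte string explicit. The cp1252 part is an assumption about what a Windows "ANSI" save would write for the Euro sign.

```python
bs = b'\xe2\x82\xac'   # the UTF-8 bytes of the Euro sign, as in the output above
us = u'\u20ac'         # the Euro sign as a single Unicode code point

# Decoding and encoding with UTF-8 convert between the two forms.
assert bs.decode('utf-8') == us
assert us.encode('utf-8') == bs
assert len(bs) == 3 and len(us) == 1

# Assumption: a file saved as Windows-1252 stores the Euro sign as one byte.
cp1252_bytes = us.encode('cp1252')

# That single byte is not valid UTF-8, so a UTF-8 declaration on such a
# file does not match its contents.
try:
    cp1252_bytes.decode('utf-8')
    mismatch_detected = False
except UnicodeDecodeError:
    mismatch_detected = True
```

If the assumption holds, this is why the file has to be saved as UTF-8 in the editor and not just declared as UTF-8 in the coding line.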
Upvotes: 2
Reputation: 798486
Perhaps you should be using unicode literals (e.g. u'€') instead.
Upvotes: 2