Reputation: 191
On my Raspberry Pi I run a Python script under Python 2.7.
I want to declare some string literals as Latin-1, not as UTF-8.
Therefore I added the statement
# -*- coding: latin-1 -*-
at the start of my file. But regardless of which coding I use, the following code snippet always treats my string as UTF-8.
s = 'äöü'
print '%s %d' %(s, len(s))
print '%x %x %x %x %x %x' % (ord(s[0]), ord(s[1]), ord(s[2]), ord(s[3]), ord(s[4]), ord(s[5]))
It always prints:
äöü 6
c3 a4 c3 b6 c3 bc
What is the correct way to declare a string literal with Latin-1 coding? In my case I would expect a string with 3 characters: 0xe4, 0xf6, 0xfc.
Upvotes: 1
Views: 114
Reputation: 120486
'\xe4\xf6\xfc'
is a byte string with the 3 bytes you specified.
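As to why your original literal comes out as UTF-8: that depends on how your editor saves the file. The coding line only tells Python how to interpret the bytes in the source file; it does not convert them. If the editor saves the file as UTF-8, a plain byte-string literal in Python 2 still holds the UTF-8 bytes. You might want to check that the file is really being saved as Latin-1 by looking at your source file with hexdump.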
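If hexdump is not at hand, the same check can be sketched in Python. This is only an illustration: save_as is a hypothetical helper that mimics an editor saving the same three characters in two encodings, and hex_bytes and file_bytes are names made up for this example. It is written to run on both Python 2.7 and Python 3.

```python
import binascii
import os
import tempfile


def hex_bytes(data):
    """Format a byte string as space-separated hex pairs, e.g. 'e4 f6 fc'."""
    digits = binascii.hexlify(data).decode('ascii')
    return ' '.join(digits[i:i + 2] for i in range(0, len(digits), 2))


def file_bytes(path):
    """Read a file in binary mode so that no decoding takes place."""
    with open(path, 'rb') as handle:
        return handle.read()


def save_as(text, encoding):
    """Hypothetical helper: mimic an editor saving `text` in `encoding`."""
    fd, path = tempfile.mkstemp(suffix='.py')
    with os.fdopen(fd, 'wb') as handle:
        handle.write(text.encode(encoding))
    return path


# a-umlaut, o-umlaut, u-umlaut, written with escapes to keep this file ASCII
literal = u'\xe4\xf6\xfc'

results = {}
for encoding in ('latin-1', 'utf-8'):
    path = save_as(literal, encoding)
    results[encoding] = hex_bytes(file_bytes(path))
    os.remove(path)
    print(encoding + ': ' + results[encoding])
```

If your real source file shows c3 a4 c3 b6 c3 bc where the literal sits, the editor saved it as UTF-8, whatever the coding line says.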
Upvotes: 0
Reputation: 1690
If I understand your problem correctly, you can write the literal as a unicode string (note the u prefix) and encode it to Latin-1:
s = u'äöü'.encode('latin-1')
Example:
>>> s = u'ééé'.encode('latin1')
>>> s.decode('latin1')
u'\xe9\xe9\xe9'
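To tie this back to the three bytes asked for in the question, here is a small sketch that encodes the same text both ways and prints the length and hex value of each byte. The variable names are only illustrative, and the characters are written as escapes so the result does not depend on how the file is saved. It should run on Python 2.7 and Python 3.

```python
# a-umlaut, o-umlaut, u-umlaut as a unicode string
text = u'\xe4\xf6\xfc'

latin1 = text.encode('latin-1')  # one byte per character
utf8 = text.encode('utf-8')      # two bytes per character for these three

# bytearray yields integers on both Python 2 and Python 3
latin1_hex = ['%x' % b for b in bytearray(latin1)]
utf8_hex = ['%x' % b for b in bytearray(utf8)]

print('%d %s' % (len(latin1), ' '.join(latin1_hex)))
print('%d %s' % (len(utf8), ' '.join(utf8_hex)))

# Decoding with the matching codec restores the original text
roundtrip = latin1.decode('latin-1')
```

The first line printed is the 3-byte Latin-1 form (e4 f6 fc), the second the 6-byte UTF-8 form seen in the question.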
Let me know whether that works for you.
Upvotes: 2