Reputation: 43
I thought I understood Unicode and Python, but this issue confuses me a lot. Look at this small test program:
# -*- coding: utf-8 -*-

class TestC(object):
    def __str__(self):
        return u'äöü'

import sys
print sys.version
print sys.stdin.encoding
print sys.stdout.encoding

print u'öäü' #this works
x = TestC()
print x #this doesn't always work
When I run this from my bash terminal on Ubuntu, I get the following result:
2.7.3 (default, Aug 1 2012, 05:14:39)
[GCC 4.6.3]
utf-8
utf-8
öäü
Traceback (most recent call last):
File "test_mod.py", line 14, in <module>
print x #this doesn't '
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-2: ordinal not in range(128)
However, when I run the same thing from within Eclipse (using the PyDev module), both print statements work flawlessly. The console window says:
2.7.3 (default, Aug 1 2012, 05:14:39)
[GCC 4.6.3]
utf-8
utf-8
öäü
äöü
Can someone please explain to me what the issue is? Why does the __str__ method work in one case but not in the other? What is the best way to fix this?
Upvotes: 3
Views: 1841
Reputation: 15944
See this related question: Python __str__ versus __unicode__
Basically, you should probably be implementing the special method __unicode__ rather than __str__, and add a stub __str__ that calls __unicode__:
    def __str__(self):
        return unicode(self).encode('utf-8')
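For context, here is a rough sketch of how the whole class might look with this change. In Python 2, when __str__ returns a unicode object, the interpreter converts it to a byte string using the default encoding, which is normally ASCII; that conversion is what raises the UnicodeEncodeError in the question. The helper implicit_ascii_encode_fails is a hypothetical name used only to illustrate that step, the choice of UTF-8 is an assumption matching the terminal in the question, and the Python 3 branch is there only so the sketch also runs on newer interpreters.

```python
# -*- coding: utf-8 -*-
import sys

PY2 = sys.version_info[0] == 2  # the question uses Python 2.7


class TestC(object):
    def __unicode__(self):
        # Single source of truth: return text, not bytes.
        return u'äöü'

    def __str__(self):
        text = self.__unicode__()
        if PY2:
            # Python 2: __str__ should return bytes, so encode explicitly.
            # UTF-8 is an assumption that matches the terminal in the question.
            return text.encode('utf-8')
        # Python 3: str is already text; this branch only keeps the sketch runnable there.
        return text


def implicit_ascii_encode_fails(text):
    # Hypothetical helper: mimics the ASCII conversion Python 2 attempts
    # when __str__ returns a unicode object.
    try:
        text.encode('ascii')
    except UnicodeEncodeError:
        return True
    return False
```

With a class like this, print x no longer depends on the interpreter's default encoding. That dependence may also be why the original script behaved differently under PyDev, if that environment sets a non-ASCII default encoding (an assumption, not something verified here).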
Upvotes: 7