The __str__ method returning a unicode string works in one environment but fails in another

Question

I thought I understood Unicode and Python, but this issue confuses me a lot. Look at this small test program:

# -*- coding: utf-8 -*-

class TestC(object):

    def __str__(self):
        return u'äöü'

import sys
print sys.version
print sys.stdin.encoding
print sys.stdout.encoding
print u'öäü' #this works
x = TestC()
print x #this doesn't always work

When I run this from my Bash terminal on Ubuntu, I get the following result:

2.7.3 (default, Aug  1 2012, 05:14:39) 
[GCC 4.6.3]
utf-8
utf-8
öäü
Traceback (most recent call last):
  File "test_mod.py", line 14, in <module>
    print x #this doesn't always work
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-2: ordinal not in range(128)

However, when I run the same program from within Eclipse (using the PyDev plugin), both print statements work flawlessly. The console window shows:

2.7.3 (default, Aug  1 2012, 05:14:39) 
[GCC 4.6.3]
utf-8
utf-8
öäü
äöü

Can someone please explain to me what the issue is? Why does the __str__ method work in one case but not in the other? What is the best way to fix this?

Edward Loper · Accepted Answer

See this related question: Python __str__ versus __unicode__

Basically, you should probably implement the special method __unicode__ rather than __str__, and add a stub __str__ that calls __unicode__:

def __str__(self):
    return unicode(self).encode('utf-8')
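
As for why the original fails: in Python 2, print x calls str(x), and when __str__ returns a unicode object, Python implicitly encodes it with the interpreter's default encoding (normally ASCII), not with sys.stdout.encoding. That is the 'ascii' codec named in the traceback. As far as I know, PyDev changes the default encoding to match its console, which would explain why the same script works there, but treat that part as an assumption. Below is a minimal sketch of the whole class with the fix applied. The PY2 flag and the ascii_error name are only illustrative, and the Python 3 branch is an addition so the snippet runs on either version:

```python
# -*- coding: utf-8 -*-
import sys

# Illustrative flag, not part of the original question.
PY2 = sys.version_info[0] == 2


class TestC(object):

    def __unicode__(self):
        # Always return text (a unicode object on Python 2).
        return u'äöü'

    def __str__(self):
        text = self.__unicode__()
        # On Python 2, __str__ has to return bytes, so encode explicitly
        # instead of relying on the implicit default-encoding conversion.
        # On Python 3, str is already text, so return it unchanged.
        return text.encode('utf-8') if PY2 else text


# Roughly what Python 2 attempts behind the scenes when __str__ returns
# unicode and the default encoding is ASCII:
try:
    u'äöü'.encode('ascii')
except UnicodeEncodeError as exc:
    ascii_error = exc  # same error type as in the question's traceback

x = TestC()
```

With this in place, print x writes UTF-8 bytes on Python 2, which display correctly on a UTF-8 terminal, and print unicode(x) lets Python encode the text using sys.stdout.encoding instead. Hard-coding 'utf-8' in __str__ is a design choice: it is the simplest option, but it will show garbage on a terminal that uses a different encoding.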
