Michiel Rademakers

Reputation: 43

python str() function result differs from __str__() function's result

I am updating a hobby app, written in Python 2.7 on Ubuntu 14.04, that stores railway history data in JSON. Up to now I have used it to work on British data.

When I started with French data, I encountered a problem that puzzles me. I have a class CompaniesCache which implements __str__(). Inside that implementation everything uses strs. Let's say I instantiate a CompaniesCache and assign it to a variable companies. When I give the command print companies in IPython2, I get an error:

UnicodeEncodeError: 'ascii' codec can't encode character u'\xe0' in position 184: ordinal not in range(128)

Alright, that is not strange. Testing further, str(companies) reproduces the error, as expected. But companies.__str__() succeeds without problems, as does print companies.__str__(). What is wrong here?

Here is the code of the __str__ method of the CompaniesCache object:

class CompaniesCache(object):
    def __init__(self, railrefdatapath):
        self.cache = restoreCompanies(railrefdatapath)

    def __getitem__(self, compcode):
        return self.cache[compcode.upper()]

    def __str__(self):
        s = ''
        for k in sorted(self.cache.keys()):
            s += '\n%s: %s' % (k, self[k].title)
        return s

This is the code for the CompaniesCache object, which contains Company objects in its cache dict. The Company object does not implement the __str__() method.

Upvotes: 4

Views: 372

Answers (2)

user2357112

Reputation: 281997

str doesn't just call __str__. Among other things, it validates the return type, it falls back to __repr__ if __str__ isn't available, and it tries to convert unicode return values to str with the ASCII codec.

Your __str__ method is returning a unicode instance with non-ASCII characters. When str tries to convert that to a bytestring, it fails, producing the error you're seeing.
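
To make the mechanism concrete, here is a minimal sketch of what probably happens inside your __str__. The title value and the 'PO' key are made up for illustration; the assumption is that your titles come out of the JSON loader as unicode objects. In Python 2, formatting a byte-string template with a unicode argument silently produces a unicode result, and str() then tries to encode that result with the default (ASCII) codec.

```python
# -*- coding: utf-8 -*-
# Hypothetical title, standing in for what the JSON loader returns in
# Python 2 (a unicode object, not a str).
title = u'Paris \xe0 Orl\xe9ans'

# In Python 2, a byte-string template combined with a unicode argument
# is promoted to unicode, so `line` is no longer a str.
line = '\n%s: %s' % ('PO', title)

# This mirrors the conversion str() attempts on a unicode return value
# in Python 2: encoding with the ASCII codec, which cannot represent
# the accented characters.
try:
    line.encode('ascii')
    failed = False
except UnicodeEncodeError:
    failed = True
```

Calling companies.__str__() directly skips that conversion step and simply hands back the unicode object, which is why it appears to work.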

Don't return a unicode object from __str__. You can implement a __unicode__ method to define how unicode(your_object) behaves, and return an appropriately-encoded bytestring from __str__.
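
A sketch of how that split could look, under some assumptions: Company here is a bare stand-in for your real class, the cache dict is passed in directly instead of being loaded with restoreCompanies, and UTF-8 is my guess at an appropriate encoding for your terminal. The version check is only there so the sketch also runs on Python 3; on 2.7 alone you would just return the encoded bytes.

```python
# -*- coding: utf-8 -*-
import sys

PY2 = sys.version_info[0] == 2


class Company(object):
    """Hypothetical stand-in for the question's Company class."""

    def __init__(self, title):
        self.title = title


class CompaniesCache(object):
    def __init__(self, cache):
        # The original fills this dict via restoreCompanies(railrefdatapath).
        self.cache = cache

    def __getitem__(self, compcode):
        return self.cache[compcode.upper()]

    def __unicode__(self):
        # Build the text as unicode from start to finish.
        return u''.join(u'\n%s: %s' % (k, self[k].title)
                        for k in sorted(self.cache))

    def __str__(self):
        text = self.__unicode__()
        # Python 2: str() needs bytes, so encode explicitly (UTF-8 assumed).
        # Python 3: str is already text, so return it unchanged.
        return text.encode('utf-8') if PY2 else text


companies = CompaniesCache({u'PO': Company(u'Paris \xe0 Orl\xe9ans')})
as_text = companies.__unicode__()
as_str = str(companies)
```

With this in place, str(companies) and print companies no longer go through the implicit ASCII conversion, and unicode(companies) gives you the text form when you need it.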

Upvotes: 4

Yonatan Kiron

Reputation: 2818

Following maxpolk's answer, I think all you need to do is set your environment variable to

export LC_ALL='en_US.utf8'

All in all, I think you can find your answer in this post.

Upvotes: 0
