Reputation: 43
I am updating a hobby app, written in Python 2.7 on Ubuntu 14.04, that stores railway history data in JSON. Up to now I have used it to work on British data.
When I started on French data I encountered a problem which puzzles me. I have a class CompaniesCache which implements __str__(). Inside that implementation everything uses strs. Let's say I instantiate a CompaniesCache and assign it to a variable companies. When I give the command print companies in IPython2, I get an error:
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe0' in position 184: ordinal not in range(128)
Alright, that is not strange. Testing: str(companies) reproduces the error, as expected. But companies.__str__() succeeds without problems, as does print companies.__str__(). What is wrong here?
Here is the code of the CompaniesCache class, including its __str__ method:
class CompaniesCache(object):
    def __init__(self, railrefdatapath):
        self.cache = restoreCompanies(railrefdatapath)

    def __getitem__(self, compcode):
        return self.cache[compcode.upper()]

    def __str__(self):
        s = ''
        for k in sorted(self.cache.keys()):
            s += '\n%s: %s' % (k, self[k].title)
        return s
This is the code for the CompaniesCache object, which contains Company objects in its cache dict. The Company object does not implement the __str__() method.
Upvotes: 4
Views: 372
Reputation: 281997
str doesn't just call __str__. Among other things, it validates the return type, it falls back to __repr__ if __str__ isn't available, and it tries to convert unicode return values to str with the ASCII codec.
Your __str__ method is returning a unicode instance with non-ASCII characters. When str tries to convert that to a bytestring, it fails, producing the error you're seeing.
Don't return a unicode object from __str__. You can implement a __unicode__ method to define how unicode(your_object) behaves, and return an appropriately-encoded bytestring from __str__.
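The advice above can be sketched roughly as follows. This is only an illustration under stated assumptions: the Company class and the constructor that takes a ready-made dict are hypothetical stand-ins (the original loads its cache with restoreCompanies()), UTF-8 is an assumed choice of encoding, and the PY2 check is there only so the same sketch also runs on Python 3, where str is already text.

```python
import sys

PY2 = sys.version_info[0] == 2


class Company(object):
    """Hypothetical stand-in for the question's Company objects."""

    def __init__(self, title):
        self.title = title


class CompaniesCache(object):
    def __init__(self, cache):
        # Assumption: the dict is passed in directly instead of being
        # loaded with restoreCompanies(railrefdatapath).
        self.cache = cache

    def __getitem__(self, compcode):
        return self.cache[compcode.upper()]

    def __unicode__(self):
        # Build the text form from unicode pieces only.
        return u''.join(
            u'\n%s: %s' % (k, self[k].title) for k in sorted(self.cache)
        )

    def __str__(self):
        text = self.__unicode__()
        # Python 2: __str__ should return a bytestring, so encode explicitly
        # (UTF-8 is an assumed choice). Python 3: str is already text.
        return text.encode('utf-8') if PY2 else text


companies = CompaniesCache(
    {u'PLM': Company(u'Paris \xe0 Lyon et \xe0 la M\xe9diterran\xe9e')}
)
rendered = str(companies)  # no implicit ASCII conversion takes place
```

With this shape, str(companies) and print companies get an already-encoded bytestring on Python 2, while unicode(companies) returns the text itself. It would also explain why calling companies.__str__() directly appeared to work in the question: a direct method call hands back the unicode object as-is, without the conversion step that str() adds.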
Upvotes: 4
Reputation: 2818
Going by maxpolk's answer, I think all you need to do is set this environment variable:
export LC_ALL='en_US.utf8'
All in all, I think you can find your answer in this post.
Upvotes: 0