Sam

Reputation: 407

Convert a unicode string so an object can be displayed with __repr__ on the terminal

I would like to convert the string u'Eichst\xe4tt-Landershofen' so that I can print the station object on the terminal.

import json

class Station(object):
    def __init__(self,id, name, latitude, longitude):
        self._id = id
        self._name = name
        self._latitude = latitude
        self._longitude = longitude
        ....
    def get_name(self):
        return self._name

    def __repr__(self):
        return '<object=%s - id=%s, name=%s, latitude=%s, longitude=%s>' \
        % (self.__class__.__name__, self._id, self._name, self._latitude,\
            self._longitude)

If I call the get_name() method of the station object, everything is fine. But if I try to print the whole object, which goes through __repr__, I get the following error:

print station.Station(id, name, lat, long) 
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe4' in position 38: ordinal not in range(128)

The string u'Eichst\xe4tt-Landershofen' was read from a file with encoding='ISO-8859-1'.

Upvotes: 2

Views: 625

Answers (1)

wim

Reputation: 363253

Firstly, I would recommend not using __repr__ for this at all: it is not really intended to be a human-readable representation of the object. For that you should be looking at __str__, __format__, and/or __unicode__.

Now, your issue is that __repr__ is returning a unicode object. This is because when you use a string substitution such as '<name %s>' % _name and _name is bound to a unicode object, Python 2 automatically "promotes" the byte string template to unicode in order to perform the substitution.
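To see the promotion in isolation, here is a minimal sketch. The variable names are made up for illustration. The type difference only shows up under Python 2; on Python 3 both values are plain str, so the snippet runs there but has nothing to promote.

```python
template = '<name %s>'     # a byte string (str) under Python 2
name = u'Eichst\xe4tt'     # a unicode object

# The %-substitution returns unicode under Python 2, even though
# the template itself was a byte string.
result = template % name

print(type(template).__name__)  # 'str'
print(type(result).__name__)    # 'unicode' on Python 2, 'str' on Python 3
```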

Upon seeing a unicode object returned from __repr__, Python 2 tries to get a byte string back by encoding it with sys.getdefaultencoding(), which is evidently 'ascii'. That fails because the station name contains u'\xe4', which cannot be encoded in ASCII.
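The failing step can be reproduced by hand. This is only a sketch of the encode call that Python 2 makes implicitly, plus two ways of getting bytes that do succeed. The index reported here is 6 rather than 38 because the bare name is encoded, without the surrounding '<object=...' text.

```python
name = u'Eichst\xe4tt-Landershofen'

failed_at = None
try:
    name.encode('ascii')  # the conversion Python 2 attempts implicitly
except UnicodeEncodeError as exc:
    failed_at = exc.start  # index of the first character that cannot be encoded
    print('ascii failed at index %d: %s' % (failed_at, exc.reason))

# An encoding that can represent the character succeeds.
utf8_bytes = name.encode('utf-8')

# Alternatively, stay in ASCII and escape whatever does not fit.
escaped = name.encode('ascii', 'backslashreplace')

print(repr(utf8_bytes))
print(repr(escaped))
```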

If you absolutely want the non-ascii characters in your repr (why??) you will have to choose an encoding which your terminal understands, and encode to that character set. Here is an example with utf-8 which will probably work on your system:

import json

class Station(object):
    def __init__(self, id, name, latitude, longitude):
        self._id = id
        self._name = name
        self._latitude = latitude
        self._longitude = longitude

    def get_name(self):
        return self._name

    def __unicode__(self):
        # Build the text as unicode so non-ASCII names are handled correctly.
        return u'<object={} - id={}, name={}, latitude={}, longitude={}>'.format(
            self.__class__.__name__,
            self._id,
            self.get_name(),
            self._latitude,
            self._longitude,
        )

    def __repr__(self):
        # Python 2 expects a byte string from __repr__, so encode explicitly.
        return unicode(self).encode('utf8')
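
If you would rather not hard-code utf-8, one option is to ask the terminal what it expects. The sketch below is an assumption on my part, not something your code needs: terminal_encoding and encode_for_terminal are hypothetical helper names, and sys.stdout.encoding can be None (for example when output is piped), hence the fallback.

```python
import sys

def terminal_encoding(default='utf-8'):
    # Hypothetical helper: use the encoding reported by stdout,
    # falling back to `default` when none is reported.
    return getattr(sys.stdout, 'encoding', None) or default

def encode_for_terminal(text, encoding=None):
    # Hypothetical helper: encode unicode text for the terminal and
    # escape characters the chosen encoding cannot represent.
    encoding = encoding or terminal_encoding()
    return text.encode(encoding, 'backslashreplace')

print(repr(encode_for_terminal(u'Eichst\xe4tt-Landershofen', 'utf-8')))
print(repr(encode_for_terminal(u'Eichst\xe4tt-Landershofen', 'ascii')))
```

Under Python 2, __repr__ could then return encode_for_terminal(unicode(self)) instead of encoding to utf8 directly.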

Upvotes: 3
