Reputation: 821
This is Python 2.7. Don't judge me. :)
I have a Django application where the class file is being used for a bit of auditing. For historical reasons I have a __str__ method in this class, and I am trying to return something useful.
def __str__(self):
    return "%s %s" % (self.guid, self.setside_username)
Now, this failed with non-ASCII characters in the setside_username, as the audit log was indirectly calling __str__ like so:
log.info("change to %s", obj)
I tried renaming __str__ to __unicode__ but it still failed in the same location. So I tried to sanitize the string by ASCII-encoding it and having the encoder replace anything it didn't understand.
def __str__(self):
    return "%s %s" % (self.guid, self.setside_username.encode('ascii', 'replace'))
but that line fails with a UnicodeDecodeError, which baffles me because I thought that call would tell the encoder to replace anything it doesn't understand.
So to prove that I don't understand the difference between encode() and decode(), I s/encode/decode and suddenly the error is gone.
And I haven't a clue why. I thought decode created unicode objects and encode created byte strings, so why would decode on a unicode object help here?
Worse, my little test script that simply prints the object using a print statement is now failing!
username = self.setside_username.decode('ascii', 'replace')
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 1: ordinal not in range(128)
So I fix one use case but break another.
I need to understand this to know that I'm actually fixing the problem and not just playing whack-a-mole until the backtraces go away.
Help appreciated.
Update: Moving to __unicode__ methods that return unicode. Still seeing this:
Traceback (most recent call last):
  File "/usr/lib64/python2.6/logging/__init__.py", line 784, in emit
    msg = self.format(record)
  File "/usr/lib64/python2.6/logging/__init__.py", line 662, in format
    return fmt.format(record)
  File "/usr/lib64/python2.6/logging/__init__.py", line 444, in format
    record.message = record.getMessage()
  File "/usr/lib64/python2.6/logging/__init__.py", line 314, in getMessage
    msg = msg % self.args
  File "/etc/e-smith/web/django/teleworker/clients/models.py", line 364, in __unicode__
    return u"%s %s" % (self.guid, self.setside_username)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 1: ordinal not in range(128)
[03/Aug/2017 12:59:03.907] ERROR [MainThread] [tug-eventd.tug-eventd:1727] Error handling cluster event: 'ascii' codec can't decode byte 0xc3 in position 1: ordinal not in range(128)
Upvotes: 1
Views: 942
Reputation: 1123520
You need to give your model a __unicode__ method that actually returns Unicode:
def __unicode__(self):
    return u"%s %s" % (self.guid, self.setside_username)
Note the u prefix: we used a Unicode literal, and we did not encode the username.
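As for why encode() raised a UnicodeDecodeError and decode() raised a UnicodeEncodeError: on Python 2, calling .encode() on a byte string first decodes it implicitly as ASCII, and calling .decode() on a unicode object first encodes it implicitly as ASCII. Here is a rough sketch that spells out those hidden steps explicitly, so it behaves the same on Python 2 and 3. The sample value u"r\xe9mi" is made up for illustration, and UTF-8 is an assumption about how the stored bytes are encoded.

```python
# Hypothetical sample data: the same name as bytes and as Unicode text.
raw = u"r\xe9mi".encode("utf-8")   # byte string; contains 0xc3 at index 1
text = u"r\xe9mi"                   # Unicode text; contains u'\xe9' at index 1

# Python 2: raw.encode('ascii', 'replace') really means
# "decode as ASCII (strict), then encode". The hidden decode is what fails.
try:
    raw.decode("ascii").encode("ascii", "replace")
    encode_error = None
except UnicodeDecodeError as exc:
    encode_error = exc

# Python 2: text.decode('ascii', 'replace') really means
# "encode as ASCII (strict), then decode". The hidden encode is what fails.
try:
    text.encode("ascii").decode("ascii", "replace")
    decode_error = None
except UnicodeEncodeError as exc:
    decode_error = exc
```

In both cases the 'replace' argument only applies to the second, explicit step, so it never gets a chance to help.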
The Django model base class provides a __str__ method that'll take the __unicode__ output and encode it to a bytestring for you.
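If the attribute can hold a byte string (the 0xc3 byte in the question's traceback suggests it sometimes does), it has to be decoded before it is interpolated into the Unicode literal. A minimal sketch of the pattern without Django: AuditRecord and str_bytes are hypothetical stand-ins, not Django code, and UTF-8 is an assumption about the stored encoding.

```python
class AuditRecord(object):
    """Hypothetical stand-in for the model; not a real Django class."""

    def __init__(self, guid, setside_username):
        self.guid = guid
        self.setside_username = setside_username

    def __unicode__(self):
        username = self.setside_username
        # Decode explicitly if we were handed bytes (assumed to be UTF-8),
        # so Python 2 never falls back to an implicit ASCII decode.
        if isinstance(username, bytes):
            username = username.decode("utf-8", "replace")
        return u"%s %s" % (self.guid, username)


def str_bytes(obj):
    """Roughly what a __str__ built on __unicode__ does on Python 2:
    take the Unicode result and encode it to a byte string."""
    return obj.__unicode__().encode("utf-8")


record = AuditRecord(u"guid-1", u"r\xe9mi".encode("utf-8"))
as_text = record.__unicode__()
as_bytes = str_bytes(record)
```

The idea is to decode once at the boundary, work with Unicode everywhere inside, and encode only on the way out.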
Upvotes: 2