UnicodeDecodeError when using a Python string handling function

Question

I'm doing this:

word.rstrip(s)

Where word and s are strings containing unicode characters.

I'm getting this:

UnicodeDecodeError: 'ascii' codec can't decode byte 0xe0 in position 0: ordinal not in range(128)

There's a bug report where this error happens on some Windows Django systems. However, my situation seems unrelated to that case.

What could be the problem?

EDIT: The code is like this:

def Strip(word):
    for s in suffixes:
        return word.rstrip(s)

lvc · Accepted Answer

The problem is that s is a bytestring while word is a unicode string. In Python 2, rstrip then tries to convert s to a unicode string so that the two can be compared, and it does so by decoding s as ASCII. That fails because s contains a byte outside the ASCII range (0xe0 in your traceback).
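To see where the message comes from, here is a minimal sketch that performs the ASCII decode by hand. It assumes the source file is saved as UTF-8, so the non-unicode literal 'ি' is stored as the three bytes shown; the variable names are mine, not from the question.

```python
# -*- coding: utf-8 -*-

# Assumption: in a UTF-8 source file, the plain literal 'ি' (U+09BF)
# is stored as these three bytes.
suffix_bytes = b'\xe0\xa6\xbf'

# This is the conversion Python 2 attempts implicitly when a bytestring
# is passed to a method of a unicode string.
try:
    suffix_bytes.decode('ascii')
    message = None
except UnicodeDecodeError as exc:
    message = str(exc)

print(message)
```

The first byte, 0xe0, is already outside the 0-127 range, which is why the error reports position 0.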

Since you initialise it as a literal, it is very easy to turn it into a unicode string by putting a u in front of it:

suffixes = [u'ি']

This will work. As you add more suffixes, each one needs its own u prefix.
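A short sketch of the fix, plus one way to handle suffixes that arrive as bytes rather than literals (for example, read from a file). The to_unicode helper is hypothetical, and the UTF-8 default is an assumption about how your data is encoded.

```python
# -*- coding: utf-8 -*-

def to_unicode(value, encoding='utf-8'):
    """Hypothetical helper: decode bytestrings, pass unicode strings through."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value

# Suffixes written as unicode literals, as suggested above.
suffixes = [u'\u09bf']                      # u'ি'

# A suffix that arrived as UTF-8 bytes is decoded explicitly
# instead of relying on the implicit ASCII decode.
raw_suffix = b'\xe0\xa6\xbf'
suffixes.append(to_unicode(raw_suffix))

word = u'\u0995\u09bf'                      # u'কি'
stripped = word.rstrip(suffixes[0])         # both operands are unicode now
```

With both operands as unicode strings, no implicit decode happens and rstrip behaves as expected.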
