Parand

Reputation: 106390

Django: dealing with non-ascii parameters

I'm running into an issue handling non-ASCII POST parameters. Here's a curl request that reproduces the problem:

curl "http://localhost:8000/api/txt/" -d \
"sender=joe&comments=Bus%20%A3963.33%20London%20to%20Sydney"

The pound sign in comments is causing the issue: when I try to do just about anything with request.POST['comments'] I get:

UnicodeEncodeError: 'ascii' codec can't encode character u'\ufffd' in position 4: ordinal not in range(128)

For example, if I just try to log what comments is:

message = request.POST.get('comments', None)
file('/tmp/comments.txt', 'wb').write(message)

I get the above error. Or when I try to decode it, I get the same error:

try:
    message = message.decode('ISO-8859-2','ignore').encode('utf-8','ignore')
except Exception, e:
    file('/tmp/ERROR-decode.txt','w').write(str(e))

produces ERROR-decode.txt with:

'ascii' codec can't encode character u'\ufffd' in position 4: ordinal not in range(128)

Ideas?

Upvotes: 2

Views: 2988

Answers (2)

Ignacio Vazquez-Abrams

Reputation: 799580

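%A3 is wrong: a lone 0xA3 byte is not valid UTF-8. To be correct UTF-8, the request should send %C2%A3 (the pound sign £, which is what 0xA3 means in ISO-8859-1) or %C5%81 (Ł, which is what 0xA3 means in ISO-8859-2).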

Also, see "Unicode In Python, Completely Demystified".
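
To make the point about %A3 concrete, here is a minimal sketch in Python 3 (the question itself uses Python 2, so treat this as an illustration rather than a drop-in fix). It only uses the standard library; the variable names are made up for the example. The assumption that the framework decodes the body as UTF-8 with replacement characters is inferred from the u'\ufffd' at position 4 in the question's traceback.

```python
from urllib.parse import quote, unquote_to_bytes

# Undo the percent-encoding but keep raw bytes: %A3 becomes the single byte 0xA3.
raw = unquote_to_bytes("Bus%20%A3963.33")

# A lone 0xA3 byte is not a valid UTF-8 sequence, so strict decoding fails.
try:
    raw.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False

# Decoding leniently substitutes U+FFFD for the bad byte (assumed to be
# what happened before the value reached request.POST).
replaced = raw.decode("utf-8", errors="replace")

# The same byte means different characters in different legacy charsets.
as_latin1 = raw.decode("iso-8859-1")  # pound sign
as_latin2 = raw.decode("iso-8859-2")  # L with stroke

# quote() encodes as UTF-8 by default, giving the escapes named above.
pound_quoted = quote("£")
l_stroke_quoted = quote("Ł")
```

If the client can be changed, sending the UTF-8 escape for the intended character avoids the problem at its source.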

Upvotes: 2

Stefano Borini

Reputation: 143935

I think you first have to pass it through urllib.unquote() to undo the percent-encoding applied for HTTP; then you can convert the resulting byte string to unicode with the proper encoding:

>>> import urllib
>>> unicode(urllib.unquote("Bus%20%A3963.33%20London%20to%20Sydney"),
...         "iso-8859-2").encode("utf-8")
'Bus \xc5\x81963.33 London to Sydney'
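
For comparison, a rough Python 3 sketch of the same idea applied to the whole request body from the question, using urllib.parse.parse_qs with an explicit encoding. It assumes the client really meant ISO-8859-1 (where 0xA3 is the pound sign); with "iso-8859-2" the same byte would decode to Ł, as in the output above. The names body, params and comments are illustrative only.

```python
from urllib.parse import parse_qs

# Raw form body exactly as sent by the curl command in the question.
body = "sender=joe&comments=Bus%20%A3963.33%20London%20to%20Sydney"

# Assumption: the sender used ISO-8859-1, so percent-escapes are decoded
# with that charset instead of the UTF-8 default.
params = parse_qs(body, encoding="iso-8859-1")

# parse_qs returns a list of values per key; take the first one.
comments = params["comments"][0]

# Once the value is a proper text string it can be re-encoded as UTF-8
# for logging or storage.
comments_utf8 = comments.encode("utf-8")
```

This only helps when the raw body is still available; once the value has been decoded with the wrong charset and the byte replaced by U+FFFD, the original character cannot be recovered.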

Upvotes: 0
