Shafiul
Shafiul

Reputation: 2890

Tornado decoding post arugment fails with UnicodeDecodeError

Some has been using my Tornado application and making POST requests which contain this character: ¡

Tornado was unable to decode the value and ended up with this error: HTTP 400: Bad Request (Invalid unicode in PARAMNAME: b'DATAHERE')

So I made some investigation and learned that In request body, I was receiving %A1 for the corresponding character, which python's decode method had no difficulty to decode for utf-8 encoding.

But, after URL-decoding this value, Tornado ended up with \xa1 for the character and tried to decode this using utf-8 and failed, because this was actually ISO-8859-1 encoding.

So, what should be the appropriate way to fix this? Because user is sending valid output I don't want to loose this data.

Upvotes: 1

Views: 1067

Answers (1)

Ben Darnell
Ben Darnell

Reputation: 22134

The best answer is to make sure the client always sends utf8 instead of iso8859-1 (this used to require weird tricks like the rails snowman; I'm not sure about the current state of the art). If you cannot do that, override RequestHandler.decode_argument (http://www.tornadoweb.org/en/stable/web.html#tornado.web.RequestHandler.decode_argument), which can see the raw bytes and decide how to decode them (or pass them through unchanged if you don't want to decode at this point).

Upvotes: 1

Related Questions