Reputation: 2890
Someone has been using my Tornado application and making POST requests that contain this character: ¡
Tornado was unable to decode the value and ended up with this error: HTTP 400: Bad Request (Invalid unicode in PARAMNAME: b'DATAHERE')
So I investigated and learned that in the request body I was receiving %A1 for the corresponding character, which Python's decode method had no difficulty decoding as utf-8.
But after URL-decoding this value, Tornado ended up with \xa1 for the character, tried to decode it as utf-8, and failed, because the byte was actually ISO-8859-1 encoded.
So, what would be the appropriate way to fix this? Since the user is sending valid input, I don't want to lose this data.
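For reference, the failure can be reproduced outside Tornado with the standard library alone. This is only a minimal sketch of the decoding steps described above; the variable names are illustrative and not taken from the application.

```python
from urllib.parse import unquote_to_bytes

# URL-decoding the percent-escape yields the single raw byte 0xA1.
raw = unquote_to_bytes("%A1")

# 0xA1 on its own is not a valid UTF-8 sequence, so strict decoding fails.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# The same byte is a valid ISO-8859-1 character (the inverted exclamation mark).
latin1_text = raw.decode("iso-8859-1")

print(raw, utf8_ok, latin1_text)
```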
Upvotes: 1
Views: 1067
Reputation: 22134
The best answer is to make sure the client always sends UTF-8 instead of ISO-8859-1 (this used to require weird tricks like the Rails snowman; I'm not sure about the current state of the art). If you cannot do that, override RequestHandler.decode_argument (http://www.tornadoweb.org/en/stable/web.html#tornado.web.RequestHandler.decode_argument), which sees the raw bytes and can decide how to decode them (or pass them through unchanged if you don't want to decode at this point).
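A rough sketch of what such an override might look like, assuming that any argument that is not valid UTF-8 was sent as ISO-8859-1. That assumption is a guess about the clients, not something Tornado can verify: ISO-8859-1 accepts every byte, so data in any other encoding would be decoded silently into the wrong characters. The helper decode_with_fallback, the class name FallbackDecodingHandler, and the "text" parameter are hypothetical names used only for illustration.

```python
from typing import Optional

import tornado.web


def decode_with_fallback(value: bytes) -> str:
    """Try UTF-8 first; fall back to ISO-8859-1 (assumed client encoding)."""
    try:
        return value.decode("utf-8")
    except UnicodeDecodeError:
        # ISO-8859-1 maps every byte to a character, so this never raises.
        return value.decode("iso-8859-1")


class FallbackDecodingHandler(tornado.web.RequestHandler):
    def decode_argument(self, value: bytes, name: Optional[str] = None) -> str:
        # Called by get_argument() and friends with the raw, URL-decoded bytes.
        return decode_with_fallback(value)

    def post(self) -> None:
        # "text" is a hypothetical parameter name.
        self.write(self.get_argument("text"))
```

With this in place, a body containing %A1 would be read as ¡ rather than rejected with HTTP 400, while correctly encoded UTF-8 input is decoded as before.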
Upvotes: 1