Nitzan Tomer

Reputation: 164297

Error parsing application/x-www-form-urlencoded with Unicode post data

Play refuses to accept a POST request when the data contains Unicode text, and I get:

Error parsing application/x-www-form-urlencoded

I was under the impression that everything is working great until I tried a request with text in Hebrew instead of English, so a request with

value=hey

works fine but a request with

value=%u05D4%u05D9%u05D9

fails.

I found a post about this, but its author said he made it work by changing play/api/mvc/ContentType.scala, which is something I'd like to avoid.

Any ideas?
Thanks!


Edit

I'm aware that this encoding does not conform to the standard for application/x-www-form-urlencoded, but that's the case I need to deal with. Changing the client side is currently not an option, and it uses the JavaScript escape method.

I'm looking for a solution on the backend side, that is, a Play solution.
It would be nice to find one that can be implemented in Java, but for now it looks like the solution is to write my own BodyParser (in Scala).

Upvotes: 1

Views: 2268

Answers (1)

Stephen C

Reputation: 719238

According to my research, the correct way to handle Unicode in an application/x-www-form-urlencoded body is to translate the Unicode to bytes in the document's default charset (i.e. UTF-8) and then URL-encode the bytes (i.e. %-encode).
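As a rough illustration of that rule, here is a minimal Java sketch that encodes the Hebrew value from the question using the standard java.net.URLEncoder. The class name FormEncodingExample and the helper encodeValue are made up for this example; they are not part of Play.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical example class, not part of Play.
public class FormEncodingExample {

    // Converts the text to UTF-8 bytes and %-encodes them,
    // which is what a form-urlencoded parser expects to receive.
    static String encodeValue(String value) {
        try {
            return URLEncoder.encode(value, StandardCharsets.UTF_8.name());
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on the JVM, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The three Hebrew letters U+05D4 U+05D9 U+05D9 from the question.
        String hebrew = "\u05D4\u05D9\u05D9";
        // prints value=%D7%94%D7%99%D7%99
        System.out.println("value=" + encodeValue(hebrew));
    }
}
```

Each Hebrew letter becomes two UTF-8 bytes, so the standard form is value=%D7%94%D7%99%D7%99 rather than value=%u05D4%u05D9%u05D9.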

Certainly what you are currently doing (with '%uxxxx' sequences) is not a valid encoding as far as the specifications are concerned. (You can't just pull stuff out of the air like that ... and expect it to work.)


I note that you discovered this escape syntax via your browser's console. Here's what MSDN says about the JavaScript escape() method:

"The escape and unescape functions do not work properly for non-ASCII characters and have been deprecated. In JavaScript 1.5 and later, use encodeURI, decodeURI, encodeURIComponent, and decodeURIComponent."

I think that "do not work properly" means that they use a non-standard escaping syntax that browsers don't recognize. Lesson: read the spec rather than relying on experiments.
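For the situation described in the question's edit, where the client cannot be changed, one possible backend workaround is to decode the non-standard sequences by hand before using the values. The sketch below is plain Java and only assumes the input follows the output format of JavaScript's escape(): %uXXXX for a UTF-16 code unit and %XX for a single code unit up to 0xFF. The class LegacyEscapeDecoder and its unescape method are hypothetical names. Wiring this into Play (for example through a custom BodyParser that reads the raw body) is not shown here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Play or the JDK.
public class LegacyEscapeDecoder {

    // Matches either %uXXXX (four hex digits) or %XX (two hex digits).
    private static final Pattern TOKEN =
            Pattern.compile("%u([0-9A-Fa-f]{4})|%([0-9A-Fa-f]{2})");

    // Mirrors what JavaScript's unescape() does: each token becomes
    // the single UTF-16 code unit with that hex value.
    // A literal '+' is left untouched, since escape() does not encode it.
    static String unescape(String input) {
        Matcher m = TOKEN.matcher(input);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            // Copy the unescaped text between the previous token and this one.
            out.append(input, last, m.start());
            String hex = m.group(1) != null ? m.group(1) : m.group(2);
            out.append((char) Integer.parseInt(hex, 16));
            last = m.end();
        }
        // Copy whatever follows the last token.
        out.append(input, last, input.length());
        return out.toString();
    }

    public static void main(String[] args) {
        String decoded = unescape("%u05D4%u05D9%u05D9");
        // The decoded value is the three Hebrew letters from the question.
        System.out.println(decoded.equals("\u05D4\u05D9\u05D9")); // prints true
    }
}
```

This has to be applied to the raw key and value strings before any standard URL decoding runs, because a standard decoder will reject %u as an invalid escape.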

Upvotes: 1
