Reputation: 132
I am writing an HTTP server, and to test what breaks it, I entered ઔஇᆖ into a text field. The client request is:
GET /add_text_data?message=%E0%AA%94%E0%AE%87%E1%86%96&category=log&color=black HTTP/1.1
Host: localhost
Connection: keep-alive
Cache-Control: max-age=0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.155 Safari/537.36
Accept-Encoding: gzip, deflate, sdch
Accept-Language: en-US,en;q=0.8
When I used URLDecoder.decode("%E0%AA%94%E0%AE%87%E1%86%96", "UTF-8"), I got ???. How do I fix this?
Upvotes: 0
Views: 104
Reputation: 132
It turns out that this is not an issue with URLDecoder, but with the OutputStream the result was printed to. URLDecoder.decode("%E0%AA%94%E0%AE%87%E1%86%96", "UTF-8").equals("ઔஇᆖ") is actually true, so the decoding itself was correct. I just needed to set the Eclipse console to accept UTF-8 output. This question fixed it for me.
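For anyone who hits the same thing, here is a minimal sketch of how to tell the two problems apart. The class name DecodeDemo and the helper decodeMessage are made up for illustration; only URLDecoder.decode and the PrintStream constructor are real JDK calls. It compares the decoded string against the expected code points instead of trusting what the console shows, then prints through a PrintStream that encodes as UTF-8. Whether the characters actually display still depends on the console (in Eclipse, the console encoding of the run configuration).

```java
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

class DecodeDemo {
    // Hypothetical helper: decodes a percent-encoded query value as UTF-8.
    static String decodeMessage(String raw) {
        try {
            return URLDecoder.decode(raw, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every JVM, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        String decoded = decodeMessage("%E0%AA%94%E0%AE%87%E1%86%96");

        // U+0A94, U+0B87, U+1196 are the code points those bytes decode to.
        // Comparing in memory avoids any dependence on the console encoding.
        System.out.println(decoded.equals("\u0A94\u0B87\u1196")); // prints true

        // Print through a stream that encodes as UTF-8 rather than the
        // platform default; the console must also be set to read UTF-8.
        PrintStream out = new PrintStream(System.out, true, "UTF-8");
        out.println(decoded);
    }
}
```

If the first line prints true but the second still shows ???, the decoding is fine and only the display side needs fixing.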
Upvotes: 1
Reputation: 416
Looks like UTF-8 can't handle it.
You can test decoding here to see what kind of decoding you'd have to use:
http://encoder.mattiasgeniar.be/index.php
If you do, make sure you store the result in a data type that can hold Unicode.
Upvotes: 0