user379151

Reputation: 1379

ISO to UTF-8 on RESTful server

I have a RESTful service that expects a string in the request. When the string is passed from the browser, accented characters are garbled (�), because the browser's default encoding is ISO-8859-1. If I change the browser encoding to UTF-8, the accented characters are preserved in the request string.

Is there a way to change the string's encoding and reconstruct it as UTF-8 on the server side, so that I don't need to change the browser encoding every time?

Thanks

Upvotes: 0

Views: 2020

Answers (2)

drobert

Reputation: 1310

I've found that most browsers' default encodings depend on the system they're installed on. Most of mine (especially on Windows) default to either ISO-8859-1 or CP1252, which matches what the original post describes. Make sure that your HTTP headers and HTML meta tags specify UTF-8 encoding, and ensure your servlet container is set to use UTF-8 by default (see http://wiki.apache.org/tomcat/FAQ/CharacterEncoding#Q8 if you're using Tomcat).

Sometimes you will still get bitten when text copied from an application using (e.g.) CP1252 is pasted bit-for-bit into a textarea on a UTF-8 page. I have never gotten this to work without garbled characters.

Upvotes: 1

1218985

Reputation: 8012

The UTF-8 encoding can represent any Unicode code point. ISO-8859-1 can handle only a tiny fraction of them. So transcoding from ISO-8859-1 to UTF-8 is not a problem. Going the other way, from UTF-8 to ISO-8859-1, loses every character that ISO-8859-1 cannot represent: a substitute (in Java, "?") appears in its place, and bytes decoded with the wrong charset show up as "replacement characters" (�). To transcode your text, you can do this:

byte[] utf8 = new String(latin1, "ISO-8859-1").getBytes("UTF-8");

OR

byte[] latin1 = new String(utf8, "UTF-8").getBytes("ISO-8859-1");
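
As a rough, self-contained sketch of the two one-liners above: the class and method names (TranscodeSketch, latin1ToUtf8, utf8ToLatin1) are made up for illustration, and StandardCharsets constants are used instead of charset-name strings only to avoid the checked UnsupportedEncodingException. It assumes you really know which encoding the incoming bytes are in.

```java
import java.nio.charset.StandardCharsets;

public class TranscodeSketch {
    // Hypothetical helper: decode bytes as ISO-8859-1, re-encode as UTF-8.
    static byte[] latin1ToUtf8(byte[] latin1) {
        return new String(latin1, StandardCharsets.ISO_8859_1)
                .getBytes(StandardCharsets.UTF_8);
    }

    // Hypothetical helper: decode bytes as UTF-8, re-encode as ISO-8859-1.
    // Characters outside ISO-8859-1 are replaced with '?' by getBytes.
    static byte[] utf8ToLatin1(byte[] utf8) {
        return new String(utf8, StandardCharsets.UTF_8)
                .getBytes(StandardCharsets.ISO_8859_1);
    }

    // Render bytes as hex so the difference between encodings is visible.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        byte[] latin1 = {(byte) 0xE9};              // "é" in ISO-8859-1
        byte[] utf8 = latin1ToUtf8(latin1);
        System.out.println(hex(utf8));              // C3 A9
        System.out.println(hex(utf8ToLatin1(utf8))); // E9

        // The euro sign (U+20AC) has no ISO-8859-1 mapping, so it is lost.
        byte[] euro = "\u20AC".getBytes(StandardCharsets.UTF_8);
        System.out.println(hex(utf8ToLatin1(euro))); // 3F, i.e. '?'
    }
}
```

Note that this only helps if the bytes have not already been decoded with the wrong charset in a lossy way; once a � has been substituted, the original character cannot be recovered.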

Upvotes: 0
