vaibhav

Reputation: 4103

Charset filter causing issue in parsing UTF-8 characters

I am using Spring MVC's charset filter. This is the URL that I use to invoke my servlet from my applet:

http://192.168.0.67/MyServlet?p1=団

As you can see, the parameter contains the Unicode character 団. So I use

URLEncoder.encode("団", "UTF-8"); 

and now my URL becomes

http://192.168.0.67/MyServlet?p1=%E5%9B%A3

However, from the servlet, calling

request.getParameter("p1"); 

already returns some gibberish that cannot be decoded with URLDecoder. BTW, invoking

URLDecoder.decode("%E5%9B%A3", "UTF-8"); 

does give me the original Unicode character. It's just that the servlet has garbled the parameter before I can even decode it. Does anyone know why? Doesn't request.getParameter() decode parameters as UTF-8?

Upvotes: 4

Views: 3682

Answers (1)

BalusC

Reputation: 1108732

Spring MVC's charset filter only sets the request body encoding, not the request URI encoding. You need to set the charset for URI decoding in the servlet container configuration. Many servlet containers default to ISO-8859-1 when decoding the URI. It's unclear which servlet container you're using, so here's just an example for Tomcat: edit the <Connector> entry of /conf/server.xml to add URIEncoding="UTF-8":

<Connector ... URIEncoding="UTF-8">
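
To see why the container's URI charset matters, here is a minimal standalone sketch of what presumably happens to the parameter. It assumes the container decodes the percent-encoded bytes as ISO-8859-1. The class and method names (UriEncodingDemo, decodeAs, reinterpretAsUtf8) are made up for illustration, and 団 is written as the escape \u56E3 so the source compiles regardless of file encoding:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of any servlet container or Spring API.
class UriEncodingDemo {

    // Decodes a percent-encoded value with the given charset,
    // imitating what a container does to the query string.
    static String decodeAs(String encoded, String charset) {
        try {
            return URLDecoder.decode(encoded, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    // Reinterprets a string that was wrongly decoded as ISO-8859-1 as UTF-8.
    // This works because ISO-8859-1 maps every byte to exactly one char.
    static String reinterpretAsUtf8(String garbled) {
        byte[] rawBytes = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(rawBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String encoded = "%E5%9B%A3"; // UTF-8 percent-encoding of \u56E3

        String garbled = decodeAs(encoded, "ISO-8859-1"); // assumed container default
        String correct = decodeAs(encoded, "UTF-8");      // what URIEncoding="UTF-8" gives

        System.out.println(garbled.length());                            // 3
        System.out.println(correct.length());                            // 1
        System.out.println(reinterpretAsUtf8(garbled).equals(correct));  // true
    }
}
```

The three bytes turn into three unrelated characters under ISO-8859-1, which is why a later URLDecoder call can't help: there is nothing percent-encoded left to decode. The reinterpretation step only shows that the bytes themselves survive. Fixing the container's configuration is still the cleaner solution.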

If you can't edit the server's configuration for some reason (e.g. third-party hosting), then you should consider using POST instead of GET:

String query = "p1=" + URLEncoder.encode("団", "UTF-8");
URLConnection connection = new URL(getCodeBase(), "MyServlet").openConnection();
connection.setDoOutput(true); // This sets request method to POST.
connection.getOutputStream().write(query.getBytes("UTF-8"));
// ...

This way, in doPost() you can use ServletRequest#setCharacterEncoding() to tell the Servlet API which charset to use to parse the request body (or just rely on Spring MVC's charset filter to do this job):

request.setCharacterEncoding("UTF-8");
String p1 = request.getParameter("p1"); // You don't need to decode yourself!
// ...
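
If the applet needs to send more than one parameter in the POST body, the query string could be built with a small helper along these lines. This is only a sketch: FormBody and its encode method are hypothetical names, and it assumes that both names and values should be URL-encoded as UTF-8 and joined with &:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; not part of the Servlet API or Spring.
class FormBody {

    // Builds an application/x-www-form-urlencoded body, encoding
    // every name and value as UTF-8 and keeping the map's order.
    static String encode(Map<String, String> params) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> entry : params.entrySet()) {
            if (body.length() > 0) {
                body.append('&');
            }
            try {
                body.append(URLEncoder.encode(entry.getKey(), "UTF-8"))
                    .append('=')
                    .append(URLEncoder.encode(entry.getValue(), "UTF-8"));
            } catch (UnsupportedEncodingException e) {
                // UTF-8 is required on every Java platform, so this is not expected.
                throw new IllegalStateException(e);
            }
        }
        return body.toString();
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("p1", "\u56E3"); // the character from the question
        params.put("p2", "a b");    // made-up second parameter
        System.out.println(encode(params)); // p1=%E5%9B%A3&p2=a+b
    }
}
```

The resulting string can be written to the connection's output stream with getBytes("UTF-8"), just as in the snippet above.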


Upvotes: 6
