Austin

Reputation: 680

Tomcat Character Encoding working differently between server and local development

I'm having an issue with character encoding working differently in my development environment (NetBeans and local Tomcat installation) versus our server. We're using Tomcat for a server-side servlet and a Java client.

On the server side, this code works locally on my machine:

protected void doPost(HttpServletRequest request, HttpServletResponse response) {
...
    java.util.zip.InflaterInputStream zipIn = new java.util.zip.InflaterInputStream(request.getInputStream());
    BufferedReader in = new BufferedReader(new InputStreamReader(zipIn, "UTF-8"));
    String line = in.readLine(); // correctly encoded String
...
}

However, on the actual server, specifying the character set breaks the decoding; it only works like this:

protected void doPost(HttpServletRequest request, HttpServletResponse response) {
...
    java.util.zip.InflaterInputStream zipIn = new java.util.zip.InflaterInputStream(request.getInputStream());
    BufferedReader in = new BufferedReader(new InputStreamReader(zipIn));
    String line = in.readLine(); // correctly encoded String
...
}

I've tried different versions of Tomcat (7 and 8) and of Java (7 and 8). I've also tried specifying the character set in the Tomcat connector (URIEncoding) and even as a JVM argument, but none of that seems to make a difference.

When the code above executes, the default character set is windows-1252 (I checked), which is why I was specifying UTF-8 in the InputStreamReader constructor. I have no idea how this works on our server. request.getCharacterEncoding() also returns utf-8.
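To separate the charset question from Tomcat itself, the read path can be reproduced outside the container. The following is only a minimal sketch: the class name CharsetRoundTrip and both helper methods are hypothetical, and it assumes the client deflates UTF-8 bytes, which the question does not confirm.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

public class CharsetRoundTrip {

    // Stand-in for the client (assumption): encode the text, then deflate the bytes.
    static byte[] compress(String text, Charset charset) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DeflaterOutputStream out = new DeflaterOutputStream(buffer)) {
            out.write(text.getBytes(charset));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Stand-in for the servlet: inflate the body and decode the first line.
    static String readFirstLine(byte[] body, Charset charset) {
        try (BufferedReader in = new BufferedReader(new InputStreamReader(
                new InflaterInputStream(new ByteArrayInputStream(body)), charset))) {
            return in.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Unicode escapes keep the sample independent of the source file encoding.
        String original = "h\u00e9llo \u20ac";
        byte[] body = compress(original, StandardCharsets.UTF_8);
        Charset cp1252 = Charset.forName("windows-1252");

        System.out.println("default charset: " + Charset.defaultCharset());
        System.out.println("decoded as UTF-8 matches: "
                + original.equals(readFirstLine(body, StandardCharsets.UTF_8)));
        System.out.println("decoded as windows-1252 matches: "
                + original.equals(readFirstLine(body, cp1252)));
    }
}
```

If the server only produces correct text without an explicit charset, running the same check against the bytes the real client sends would show whether they are actually UTF-8 or were encoded with the client's platform default.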

Does anyone have any ideas? Thanks in advance for any help.

Upvotes: 2

Views: 2668

Answers (3)

emi-le

Reputation: 796

For POST requests, web.xml also needs to be adjusted to include an encoding filter (as explained in How to get UTF-8 working in Java webapps?).

In most Tomcat 7+ versions, the required filter is already included and only needs to be activated by uncommenting the following lines:

1.

<filter>
    <filter-name>setCharacterEncodingFilter</filter-name>
    <filter-class>org.apache.catalina.filters.SetCharacterEncodingFilter</filter-class>
    <init-param>
        <param-name>encoding</param-name>
        <param-value>UTF-8</param-value>
    </init-param>
    <async-supported>true</async-supported>
</filter>

2.

<filter-mapping>
    <filter-name>setCharacterEncodingFilter</filter-name>
    <url-pattern>/*</url-pattern>
</filter-mapping>

When running Tomcat inside Eclipse, make sure to make the adjustments in the /Servers folder, not the installation directory. Eclipse keeps copies of all configuration files inside the workspace folders.

Upvotes: 0

Austin

Reputation: 680

It looks like I needed to start the JVM with "-Dfile.encoding=UTF-8"; that did the trick. I think the String was still being encoded with the Windows default, so once it was read from the stream the encoding got messed up. The String still printed fine to the console, but when I checked the Unicode code points, the characters were wrong.
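Since console output can hide this kind of problem, dumping the code points is a more reliable check. A minimal sketch follows; the class name CodePointDump and the describe helper are hypothetical and not part of the original code.

```java
public class CodePointDump {

    // Render each Unicode code point of the string as U+XXXX, separated by spaces.
    static String describe(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> sb.append(String.format("U+%04X ", cp)));
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // A correctly decoded e-acute is one code point.
        System.out.println(describe("\u00e9"));
        // The same UTF-8 bytes decoded as windows-1252 become two code points.
        System.out.println(describe("\u00c3\u00a9"));
    }
}
```

A correctly decoded accented character shows up as a single code point, while UTF-8 bytes decoded as windows-1252 show up as two or three code points per character.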

Upvotes: 2

Joop Eggen

Reputation: 109567

As one can set the encoding of both request and response, my guess is that the response is missing a

response.setCharacterEncoding("UTF-8");

and hence the HTTP default encoding ISO-8859-1 (Latin-1) is used, which is roughly a subset of Windows-1252 (Windows Latin-1).

So for Windows-1252, two errors would have cancelled each other out.

But do check that the compressed text really is UTF-8.
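One way to run that check is a strict decode of the inflated bytes that reports malformed input instead of silently replacing it. This is a minimal sketch; the class name Utf8Check and the isValidUtf8 helper are hypothetical, and the bytes would have to be captured from the real request after inflating.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class Utf8Check {

    // Returns true only if the bytes form a well-formed UTF-8 sequence.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] utf8 = "\u00e9".getBytes(StandardCharsets.UTF_8); // bytes C3 A9
        byte[] cp1252 = { (byte) 0xE9 };                         // e-acute in windows-1252
        System.out.println("UTF-8 bytes valid: " + isValidUtf8(utf8));
        System.out.println("windows-1252 bytes valid: " + isValidUtf8(cp1252));
    }
}
```

Pure ASCII passes this check under either encoding, so the test is only informative when the payload contains non-ASCII characters.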

Upvotes: 1
