Czar

Reputation: 366

Strange character encoding issue with Eclipse / Spring / Tomcat 6

I have been trying things all day but can't find a proper solution. My problem: I am developing a Spring MVC based app on my local Tomcat. My MySQL database is set to UTF-8 encoding, and all content in it displays properly in phpMyAdmin. Log output written by log4j to catalina.out also looks fine.

My JSP pages are configured with

<!-- encoding -->
<%@ page contentType="text/html; charset=UTF-8" %>
<%@ page pageEncoding="UTF-8" %>

Showing data on my JSP also works fine. I can also send data containing special chars from my Controller without any DB interference, e.g.

String str = "UTF-8 Test: Ä Ö Ü ß è é â";
logger.debug(str);
mav.addObject("utftest", str);

That displays correctly in the log and on the JSP page in the browser.

BUT: When I put special chars directly in my JSP file, e.g. for text in headers, it does not work. FF and Google Chrome display strange chars but report the page as UTF-8. When I switch the browser to Latin, the chars just get stranger.

I have the same problem when showing text tokens from my messages.properties file, although Eclipse says, when I right-click the file, that UTF-8 will be used.

I am a little lost and don't know where to check now.

Summary:

Any ideas? I really appreciate any tips.

Upvotes: 3

Views: 23957

Answers (5)

mtraut

Reputation: 4740

For JSP, see @BalusC.

For properties files see: http://download.oracle.com/javase/1.4.2/docs/api/java/util/Properties.html

When saving properties to a stream or loading them from a stream, the ISO 8859-1 character encoding is used. For characters that cannot be directly represented in this encoding, Unicode escapes are used; however, only a single 'u' character is allowed in an escape sequence. The native2ascii tool can be used to convert property files to and from other character encodings.
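
The quoted behaviour can be checked with a small, self-contained sketch. The class name PropertiesEscapeDemo and the key greeting are made up for illustration; the only library call assumed is the standard Properties.load(InputStream), and StandardCharsets requires Java 7 or later. Non-ASCII characters are written as escapes in the source so the demo does not depend on how the source file itself is saved.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

public class PropertiesEscapeDemo {

    // Loads properties from raw bytes through the InputStream overload,
    // which decodes the bytes as ISO 8859-1.
    static Properties loadFromBytes(byte[] bytes) {
        Properties props = new Properties();
        try {
            props.load(new ByteArrayInputStream(bytes));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        // File content using a Unicode escape: backslash, 'u', then 00e9 (pure ASCII bytes).
        byte[] escaped = "greeting=caf\\u00e9\n".getBytes(StandardCharsets.ISO_8859_1);
        // The same text saved as raw UTF-8, which the InputStream overload misreads.
        byte[] rawUtf8 = "greeting=caf\u00e9\n".getBytes(StandardCharsets.UTF_8);

        String expected = "caf\u00e9";
        System.out.println(expected.equals(loadFromBytes(escaped).getProperty("greeting"))); // true
        System.out.println(expected.equals(loadFromBytes(rawUtf8).getProperty("greeting"))); // false
    }
}
```

The second line prints false because the two UTF-8 bytes of the accented letter are decoded as two separate ISO 8859-1 characters, which is the kind of garbling described in the question.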

Upvotes: 0

Frank Masse

Reputation: 21

I'm using Tomcat 7 with the Spring framework, and using <jsp:include page="anyFile.html"/> in a JSP fails with a java.lang.IllegalStateException. <jsp:include> works fine when I include another JSP file, but when I try to include a static HTML file it keeps throwing this exception, which is related to the character encoding.

Using <jsp:directive.include file="anyFile.html" /> or <%@include file="anyFile.html"%> works, but all the special characters ("é", "è", "ç" etc.) appear encoded as ISO-8859-1 instead of UTF-8, even though the JSP file has the <%@page contentType="text/html" pageEncoding="UTF-8"%> and the <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /> in it.

I found the solution by using the JSTL tag library with the import tag:

  1. put this into the JSP: <%@taglib uri="http://java.sun.com/jsp/jstl/core" prefix="c"%>

  2. Then get the HTML file I want to include using this: <c:import url="anyFile.html" charEncoding="UTF-8"/>

As you can see, the import tag from the JSTL library has a charEncoding attribute that sets the character encoding used to read the HTML file, so its content displays correctly.

Upvotes: 2

user684934

Reputation:

As BalusC said, you must save the files in UTF-8 format.

To address your additional problem of included files, simply include the header

<%@ page contentType="text/html; charset=UTF-8" pageEncoding="UTF-8"%>

at the top of each included file. This tells the servlet container to treat the file as UTF-8 encoded instead of using the default ISO-8859-1.
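
To see why a file without that header shows up garbled, here is a minimal sketch of the mismatch: text saved as UTF-8 but decoded as ISO-8859-1. The class and method names (MojibakeDemo, misread, reread) are hypothetical, and the sketch only imitates the decoding step, not the JSP compiler itself.

```java
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {

    // Encodes the text as UTF-8 (as the editor does when saving) and then
    // decodes those bytes as ISO-8859-1 (as a reader with the wrong default does).
    static String misread(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Reverses the mistake: recovers the original bytes and decodes them as UTF-8.
    static String reread(String garbled) {
        byte[] bytes = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u00e9"; // e with acute accent, written as an escape
        String garbled = misread(original);

        System.out.println(garbled.length());                 // 2
        System.out.println(garbled.equals("\u00c3\u00a9"));   // true
        System.out.println(reread(garbled).equals(original)); // true
    }
}
```

One accented character turning into two unrelated characters is the typical symptom of this particular mismatch.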

Upvotes: 4

Doc Davluz

Reputation: 4250

The quest

I had exactly the same problem as you, with a very similar configuration (Tomcat, Spring, Spring Web Flow, JSF2).

A few facts from my own investigation:

  • WAR under Tomcat on Windows: encoding problem,
  • same WAR under Tomcat on Linux: no problem → suspect the OS default encoding, as Linux uses UTF-8,
  • same WAR under Tomcat run by Eclipse WTP on Windows: no problem → WTF?!
  • converting properties files to UTF-8 with literal Latin characters instead of Unicode escapes: solves the problem for externalized labels,
  • same in Facelets (JSF2 pages): the problem always remains, the only thing that works is <f:verbatim>&eacute;</f:verbatim>.

I still had the problem after checking all my code against the classic prerequisites and recommendations found on forums:

  • <?xml version="1.0" encoding="UTF-8" ?> at top of XML files,
  • <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /> inside HTML header of same files,
  • encoding="UTF-8" in <f:view>.

Configuring Tomcat in the following ways did nothing:

  • URIEncoding="UTF-8" on connector in server.xml (normal because it concerns URI encoding not page encoding)
  • org.springframework.web.filter.CharacterEncodingFilter on and off,
  • also this (I am presumably missing the point here):

    <locale-encoding-mapping-list>
      <locale-encoding-mapping>
        <locale>fr</locale>
        <encoding>UTF-8</encoding>
      </locale-encoding-mapping>
    </locale-encoding-mapping-list>
    

The key

I found the solution by comparing the Tomcat command line used by WTP with the one used by a classic launch of Tomcat from the MS-DOS command line. The only difference is the parameter -Dfile.encoding=UTF-8. It was the key for me to solve the problem.

Set JAVA_OPTS=-Dfile.encoding="UTF-8" and it works fine.

The (attempted) explanation

The only explanation I found: Tomcat uses the JVM encoding, which by default is the system encoding (UTF-8 on Linux, CP1252 on Windows). Eclipse WTP forces the JVM encoding according to its workspace encoding settings. Running the JVM in UTF-8 solves the problem.

I suspect this is not really the right explanation and that there is a configuration problem either in my stack or in the resource filtering done by maven-resources-plugin or maven-war-plugin, but I haven't found it yet.
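
A quick way to check which encoding a given JVM actually runs with is sketched below. The class name DefaultCharsetDemo and its helper are made up for illustration, and the sketch assumes the windows-1252 charset is available in the JDK, which is usual but not guaranteed by the specification. Note that on JDK 18 and later the default charset is UTF-8 on all platforms, so the Windows/Linux difference described above applies to older JVMs.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class DefaultCharsetDemo {

    // Encodes the text as UTF-8 and decodes the bytes with the given charset,
    // imitating code that reads a UTF-8 file using some other encoding.
    static String decodeUtf8BytesWith(Charset charset, String text) {
        return new String(text.getBytes(StandardCharsets.UTF_8), charset);
    }

    public static void main(String[] args) {
        // What this JVM was started with; both values depend on the platform and launch options.
        System.out.println("file.encoding   = " + System.getProperty("file.encoding"));
        System.out.println("default charset = " + Charset.defaultCharset());

        String text = "\u00fc"; // u with diaeresis, written as an escape
        Charset cp1252 = Charset.forName("windows-1252"); // assumed to be installed

        System.out.println(decodeUtf8BytesWith(StandardCharsets.UTF_8, text).equals(text)); // true
        System.out.println(decodeUtf8BytesWith(cp1252, text).equals(text));                 // false
    }
}
```

Running it once with and once without -Dfile.encoding=UTF-8 on the affected machine shows whether the two launch methods really differ in their default charset.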

Upvotes: 6

BalusC

Reputation: 1108782

You need to configure Eclipse to save the files as UTF-8.

Go to Window > Preferences, type encoding into the filter text box at the top, and explore all sections to set everything to UTF-8. Specifically for JSP files this is in Web > JSP Files > Encoding. Choose the topmost UTF-8 option (called "ISO 10646/Unicode(UTF-8)").

Properties files are a story apart. As per the specification, they are read as ISO-8859-1 by default. You need either the native2ascii tool for this or a custom properties file loader which uses UTF-8. For more detail, see this article.
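
A custom loader of that kind could look roughly like the sketch below. It is not the loader from the linked article: the class name Utf8PropertiesLoader and the key title are invented for illustration, and it relies on the standard Properties.load(Reader) overload (Java 6 and later) plus StandardCharsets (Java 7 and later). How to plug such a loader into Spring's message source is left out here.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

public class Utf8PropertiesLoader {

    // Reads a properties stream as UTF-8 by wrapping it in a Reader,
    // instead of using load(InputStream), which assumes ISO-8859-1.
    static Properties load(InputStream in) {
        Properties props = new Properties();
        try (Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {
            props.load(reader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        // Simulates a properties file saved as UTF-8 with a literal umlaut in it.
        String expected = "\u00dcbersicht";
        byte[] fileBytes = ("title=" + expected + "\n").getBytes(StandardCharsets.UTF_8);

        Properties props = load(new ByteArrayInputStream(fileBytes));
        System.out.println(expected.equals(props.getProperty("title"))); // true
    }
}
```

With a real file, the stream would typically come from the classpath or the file system rather than from a byte array.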

Upvotes: 3
