Madhusudanan K K C

Reputation: 230

UTF-8 issue in spite of URIEncoding="UTF-8"

Hi, I was trying to make one of the applications I am working on UTF-8 compatible. My environment is as follows: Linux OS, Apache web server as the HTTP listener, and Tomcat as the servlet engine.

Apache is configured with mod_jk, and Tomcat uses an AJP connector.

I have read the basic UTF-8 guidelines on a few sites and, based on their recommendations, I have tried the following:

Set URIEncoding="UTF-8" and useBodyEncodingForURI="true" for the connector in server.xml.

Set the language in .bashrc/.profile using LANG=en_US.UTF8

Configure the Apache server to use UTF-8 encoding by default, i.e. specify UTF-8 as the default charset using:

AddDefaultCharset utf-8

Set UTF-8 in the Java arguments when starting Tomcat, using:

JAVA_OPTS="-Djavax.servlet.request.encoding=UTF-8 -Dfile.encoding=UTF-8"

I also verified that my web pages have the proper meta tag configured:

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

In spite of all this, I am having difficulty reading UTF-8 characters. Any idea where I am going wrong?

An interesting observation is that I only face this difficulty with Internet Explorer and Chrome. When I use Firefox to send UTF-8 characters to my server, I am able to read them correctly. However, the characters get mangled with IE and Chrome. Any idea where the issue could be?

The only difference I could notice between Chrome and Firefox is in the Content-Type header.

The request header for requests from Firefox is as follows:

Content-Type: application/x-www-form-urlencoded; charset=utf-8 

Whereas for Chrome (and possibly IE as well, which I did not check) it is:

Content-Type: application/x-www-form-urlencoded 

Any idea what is going wrong here?

Upvotes: 0

Views: 4809

Answers (1)

Madhusudanan K K C

Reputation: 230

All right, I finally figured out the issue. The link below, and the list of bugs reported at the bottom of that page, were very useful for understanding the circus that was going on:

http://wiki.apache.org/tomcat/FAQ/CharacterEncoding

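Basically, one of my filters was trying to read a request parameter, and one needs to call request.setCharacterEncoding(desiredEncoding) before reading any request parameters. Chrome and IE send no charset in the Content-Type header, so without that call Tomcat decodes the POST body with its default encoding, ISO-8859-1.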
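To make the mangling concrete, here is a small standalone sketch (plain JDK, no Tomcat involved) of what happens when a UTF-8 form value is decoded as ISO-8859-1, which is the default the Tomcat FAQ describes for request bodies that carry no charset. The class and method names are made up for illustration, and it only imitates the decoding step, not Tomcat's actual parameter parsing.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

public class FormDecodingDemo {

    // Percent-encode a value the way a browser does for a UTF-8 page.
    public static String encodeAsUtf8Form(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is guaranteed to exist on every JVM.
            throw new IllegalStateException(e);
        }
    }

    // Decode a form value with the given charset, imitating the server side.
    public static String decode(String encoded, String charset) {
        try {
            return URLDecoder.decode(encoded, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String original = "caf\u00e9"; // "cafe" with an acute accent on the e
        String encoded = encodeAsUtf8Form(original);

        System.out.println(encoded); // prints caf%C3%A9

        // Correct charset: the value survives the round trip.
        System.out.println(decode(encoded, "UTF-8").equals(original)); // prints true

        // Wrong charset: each UTF-8 byte becomes its own character.
        System.out.println(decode(encoded, "ISO-8859-1").equals(original)); // prints false
    }
}
```

The bytes on the wire are the same for every browser; what differs is whether the server is told to read them as UTF-8.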

So I tried adding SetCharacterEncodingFilter, which sets the character encoding. Apparently this did not work either, because this filter is only available from Tomcat 7 onwards (not sure though) and I was on Tomcat 6.0.x.

So I had to write my own filter that sets the character encoding correctly.

With that, I am now able to get all those mangled characters out of my head. They had been bothering me too much since last night.
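For anyone in the same spot, a filter of this kind might look roughly like the sketch below. It is not the exact code I used: the class name Utf8EncodingFilter and the "encoding" init parameter are placeholders, and it assumes the javax.servlet API (servlet-api.jar, as shipped with Tomcat 6) is on the classpath.

```java
import java.io.IOException;

import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;

// Hypothetical filter: sets the request encoding before anything reads parameters.
public class Utf8EncodingFilter implements Filter {

    // Fallback used when the request does not declare a charset.
    private String encoding = "UTF-8";

    public void init(FilterConfig config) throws ServletException {
        // Optional override via an init-param named "encoding" (an assumed name).
        String configured = config.getInitParameter("encoding");
        if (configured != null) {
            encoding = configured;
        }
    }

    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
            throws IOException, ServletException {
        // Only fill in the encoding when the client sent none (the Chrome/IE case).
        if (request.getCharacterEncoding() == null) {
            request.setCharacterEncoding(encoding);
        }
        chain.doFilter(request, response);
    }

    public void destroy() {
        // Nothing to clean up.
    }
}
```

The filter has to be mapped in web.xml ahead of any filter that calls getParameter(), because setCharacterEncoding() has no effect once the parameters have been read.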

Upvotes: 1
