Prem Ananth

Reputation: 21

Java servlet: UTF-8 decoding not working

I am trying to decode a URL-encoded string in a servlet, but it is not decoded properly.

When I try the same thing in a plain Java program, it works fine.

My code is below:

import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class testsetrv extends HttpServlet 
{
    public void doGet(HttpServletRequest req,HttpServletResponse res)
    {
        String sd="நல்வரவு";
        String s="%E0%AE%A4%E0%AE%95%E0%AE%B5%E0%AE%B2%E0%AF%8D%20%E0%AE%B5%E0%AF%86%E0%AE%B1%E0%AF%8D%E0%AE%B1%E0%AE%BF%E0%AE%95%E0%AE%B0%E0%AE%AE%E0%AE%BE%E0%AE%95%20%E0%AE%9A%E0%AF%87%E0%AE%B0%E0%AF%8D%E0%AE%95%E0%AF%8D%E0%AE%95%E0%AE%AA%E0%AF%8D%E0%AE%AA%E0%AE%9F%E0%AF%8D%E0%AE%9F%E0%AE%A4%E0%AF%81";
        try {
            decode(s);
        } catch (UnsupportedEncodingException e) {
            e.printStackTrace();
        }
    }
    public void decode(String a) throws UnsupportedEncodingException
    {
        String s=URLDecoder.decode(a, "UTF-8");
        System.out.println(s);
    }
    public static void main(String a[]) throws UnsupportedEncodingException
    {
        String sd="நல்வரவு";
        String s="%E0%AE%A4%E0%AE%95%E0%AE%B5%E0%AE%B2%E0%AF%8D%20%E0%AE%B5%E0%AF%86%E0%AE%B1%E0%AF%8D%E0%AE%B1%E0%AE%BF%E0%AE%95%E0%AE%B0%E0%AE%AE%E0%AE%BE%E0%AE%95%20%E0%AE%9A%E0%AF%87%E0%AE%B0%E0%AF%8D%E0%AE%95%E0%AF%8D%E0%AE%95%E0%AE%AA%E0%AF%8D%E0%AE%AA%E0%AE%9F%E0%AF%8D%E0%AE%9F%E0%AE%A4%E0%AF%81";
        testsetrv f=new testsetrv();
        f.decode(s);
    }
}

When I run it in the servlet, I get ??????

When I run the main method, it shows the result correctly:

தகவல் வெற்றிகரமாக சேர்க்கப்பட்டது

In my server.xml I have added:

URIEncoding="UTF-8" useBodyEncodingForURI="true"

I have also added a filter to handle all requests, as below:

public class UTFFilter implements Filter
{
    private String encoding;

    public void init(FilterConfig config) throws ServletException
    {
        encoding = config.getInitParameter("requestEncoding");

        if (encoding == null) encoding = "UTF-8";
    }

    public void doFilter(ServletRequest request, ServletResponse response, FilterChain next)
            throws IOException, ServletException
    {
        // Respect the client-specified character encoding
        // (see HTTP specification section 3.4.1)
        System.out.println("asdasgadsgasdgasdgasdgsadgasgasdgsdgsgdsgsdg");
        if (null == request.getCharacterEncoding())
            request.setCharacterEncoding(encoding);

        // Set the default response content type and encoding
        response.setContentType("text/html; charset=UTF-8");
        response.setCharacterEncoding("UTF-8");

        next.doFilter(request, response);
    }

    public void destroy() {}
}

But it is still not working.

Please help!

Upvotes: 1

Views: 1470

Answers (4)

Santosh

Reputation: 17893

Some observations:

  1. The string you are trying to decode is not read from the request, so the server-side encoding settings (URIEncoding, the filter) have no effect on it.
  2. Because the source contains a native string literal (String sd="நல்வரவு";), the source file is saved and compiled as UTF-8, and that encoding is in effect when you run the program standalone. You can confirm this by printing System.getProperty("file.encoding"), which should print UTF-8 in that case.
  3. To solve the problem, set the default encoding of the server's JVM by passing the JVM parameter -Dfile.encoding=UTF-8. You can add it to the startup batch file of the app server.
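
To check the points above, here is a minimal sketch (the class name DefaultCharsetCheck and the helper roundTrip are made up for illustration). It prints the JVM's default encoding and simulates what happens to the Tamil text when it passes through a charset that cannot represent it, which is one plausible source of the ?????? output. Run it both standalone and inside the server JVM to compare the reported encodings.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class DefaultCharsetCheck {
    // Encodes the text with the given charset and decodes it again.
    // Characters the charset cannot represent come back as '?'.
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        // What the JVM believes its default encoding is.
        System.out.println("file.encoding   = " + System.getProperty("file.encoding"));
        System.out.println("default charset = " + Charset.defaultCharset());

        // The literal from the question, written with escapes so the
        // result does not depend on how this file is saved or compiled.
        String tamil = "\u0BA8\u0BB2\u0BCD\u0BB5\u0BB0\u0BB5\u0BC1";

        // Lossless: UTF-8 can represent every character.
        System.out.println(roundTrip(tamil, StandardCharsets.UTF_8).equals(tamil));

        // Lossy: US-ASCII has no Tamil, so each of the 7 chars becomes '?'.
        System.out.println(roundTrip(tamil, StandardCharsets.US_ASCII));
    }
}
```

If the server JVM reports a non-UTF-8 default charset, that would be consistent with the question marks; whether -Dfile.encoding alone is enough also depends on the console or log viewer you read the output in.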

Upvotes: 0

Ravi Trivedi

Reputation: 2360

Your code is perfectly fine. The problem lies in the console output: the Windows console does not print UTF-8 by default.

The main method you tested was presumably run inside your Java editor, whose console does support UTF-8 output.
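
One way to take the default encoding out of the picture is to write through a PrintStream that is explicitly set to UTF-8. This is only a sketch (Utf8Console and utf8Stream are hypothetical names), and it assumes the terminal or log viewer on the other end can actually display UTF-8:

```java
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

class Utf8Console {
    // Wraps any output stream in a PrintStream that always encodes as UTF-8,
    // regardless of the JVM's default charset.
    static PrintStream utf8Stream(OutputStream target) {
        try {
            return new PrintStream(target, true, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // First word of the encoded string from the question.
        String decoded = URLDecoder.decode(
                "%E0%AE%A4%E0%AE%95%E0%AE%B5%E0%AE%B2%E0%AF%8D", "UTF-8");

        PrintStream out = utf8Stream(System.out);
        out.println(decoded);
    }
}
```

If the text still shows up garbled with this, the bytes leaving the JVM are correct and the remaining problem is in the console's code page or font.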

Upvotes: 1

Scary Wombat

Reputation: 44824

If the output is only going to log4j, then the decoding may already be working and log4j just needs some configuring,

for example log4j.appender.rollingFile.encoding=UTF-8

If in my code I have

        String s="%E0%AE%A4%E0%AE%95%E0%AE%B5%E0%AE%B2%E0%AF%8D%20%E0%AE%B5%E0%AF%86%E0%AE%B1%E0%AF%8D%E0%AE%B1%E0%AE%BF%E0%AE%95%E0%AE%B0%E0%AE%AE%E0%AE%BE%E0%AE%95%20%E0%AE%9A%E0%AF%87%E0%AE%B0%E0%AF%8D%E0%AE%95%E0%AF%8D%E0%AE%95%E0%AE%AA%E0%AF%8D%E0%AE%AA%E0%AE%9F%E0%AF%8D%E0%AE%9F%E0%AE%A4%E0%AF%81";
        try {
            String b = URLDecoder.decode(s, "UTF-8");
            log.debug(b);
        } catch (UnsupportedEncodingException e) {
            e.printStackTrace();
        }

then the string b is correct, but its printed form is not. So it was a System.out issue in my case.
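
To confirm that the decoded string is correct without trusting the console at all, you can print its Unicode code points in plain ASCII. A small sketch (CodePointDump, decodeUtf8 and toCodePoints are names I made up for this example):

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

class CodePointDump {
    // URL-decodes as UTF-8, turning the checked exception into an unchecked one.
    static String decodeUtf8(String encoded) {
        try {
            return URLDecoder.decode(encoded, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    // Renders each code point as U+XXXX, which is pure ASCII and
    // therefore survives any console or log encoding.
    static String toCodePoints(String text) {
        StringBuilder sb = new StringBuilder();
        text.codePoints().forEach(cp -> sb.append(String.format("U+%04X ", cp)));
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // First two letters of the encoded string from the question.
        String decoded = decodeUtf8("%E0%AE%A4%E0%AE%95");
        System.out.println(toCodePoints(decoded)); // prints U+0BA4 U+0B95
    }
}
```

If the servlet prints the expected Tamil code points (the U+0B80 to U+0BFF range), the decoding works and only the output channel needs fixing.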

Upvotes: 0

N V Harikrishna

Reputation: 31

I suspect the request is not encoded in UTF-8. Print the character encoding of the incoming request in your filter, or use a tool such as Fiddler. If it is not UTF-8, that might be the problem.

You are using useBodyEncodingForURI, whose documentation says:

"This specifies if the encoding specified in contentType should be used for URI query parameters, instead of using the URIEncoding. This setting is present for compatibility with Tomcat 4.1.x, where the encoding specified in the contentType, or explicitly set using Request.setCharacterEncoding method was also used for the parameters from the URL. The default value is false."

In your case, even though you set URIEncoding to UTF-8 in server.xml, it is overridden by the encoding given in the contentType. Unset useBodyEncodingForURI (and, if required, fix the request contentType) and see whether it works.
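
To see what a charset mismatch looks like, here is a rough illustration using URLDecoder as a stand-in. It is not Tomcat's actual parameter-parsing code, and CharsetMismatch and decode are hypothetical names; it only shows how the same percent-encoded bytes come out when they are interpreted with the wrong charset:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

class CharsetMismatch {
    // URL-decodes with the named charset, turning the checked exception
    // into an unchecked one to keep the example short.
    static String decode(String encoded, String charsetName) {
        try {
            return URLDecoder.decode(encoded, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // One Tamil letter (U+0BA4), percent-encoded as UTF-8 bytes E0 AE A4.
        String encoded = "%E0%AE%A4";

        // Correct charset: the three bytes form a single character.
        System.out.println(decode(encoded, "UTF-8").length());      // prints 1

        // Wrong charset: each byte becomes its own Latin-1 character.
        System.out.println(decode(encoded, "ISO-8859-1").length()); // prints 3
    }
}
```

If the parameters you read from the request look like the second case (several accented Latin characters per Tamil letter), the container is most likely decoding them with something other than UTF-8.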

Upvotes: 1
