Jason S

Reputation: 189796

Java + unicode + HttpServletResponse = fail

I'm trying to make sure my data path -- a Tomcat servlet getting data into/out of MySQL database via JDBC -- handles Unicode directly.

I've been able to verify that I can read/write Unicode from the database. (When I debug Tomcat in Eclipse, I see the result retrieved from the database correctly.) But when I point my browser at my Tomcat servlet, a string like "García" (=Garci{U+0301}a) turns into "Garci?a" in the browser.

I'm using this code fragment to initialize the XML output, which uses XMLStreamWriter (request and response are the HttpServletRequest and HttpServletResponse fields declared below), and I declare the result as UTF-8:

final protected HttpServletRequest request;
final protected HttpServletResponse response;
   ...

boolean handleRefreshMetadata()
{
    String s = request.getParameter("ids");
    Integer id = Integer.parseInt(s);
    boolean b = refreshMetadata(id); 
    response.setContentType("text/xml");
    try {
        PrintWriter writer = response.getWriter();
        XMLOutputFactory factory = XMLOutputFactory.newInstance();
        XMLStreamWriter xmlwriter = factory.createXMLStreamWriter(writer);      

        xmlwriter.writeStartDocument("UTF-8", "1.0");
        xmlwriter.writeStartElement("response");
        xmlwriter.writeAttribute("success", b ? "true" : "false");
        if (b && (id != null))
        {
            loadArticleFromID(getConnection(), xmlwriter, id);
        }
        xmlwriter.writeEndDocument();
        xmlwriter.flush();
        xmlwriter.close();
    } catch (IOException e) {
        e.printStackTrace();
    } catch (XMLStreamException e) {
        e.printStackTrace();
    }
    catch (SQLException e) {
        e.printStackTrace();
    }
    return b;
}

Am I missing something?

Upvotes: 1

Views: 903

Answers (2)

Muhammad Nuruddin

Reputation: 1

Your content is not being sent as UTF-8. Encode the response content to bytes yourself, something like this:

final javax.servlet.http.HttpServletResponse resp = (HttpServletResponse)ctx.getExternalContext().getResponse();

byte[] k = xml.getBytes(UTF8_CHARSET); // xml is the string holding the Unicode content

resp.setContentType("text/xml");
resp.setContentLength(k.length);
resp.getOutputStream().write(k);
resp.getOutputStream().flush();
resp.getOutputStream().close();
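
A side note on why the snippet above takes the content length from the byte array rather than from the string: in UTF-8 a non-ASCII character occupies more than one byte, so the two counts differ. A minimal sketch, where the class name ByteLengthDemo and the helper utf8Length are hypothetical names used only for illustration:

```java
import java.nio.charset.StandardCharsets;

class ByteLengthDemo {
    // Number of bytes the text occupies once encoded as UTF-8.
    public static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String name = "Garci\u0301a"; // 'i' followed by COMBINING ACUTE ACCENT (U+0301)
        System.out.println(name.length());    // UTF-16 code units in the string
        System.out.println(utf8Length(name)); // bytes on the wire as UTF-8
    }
}
```

Here the string has 7 chars but encodes to 8 bytes, because U+0301 takes two bytes in UTF-8. A Content-Length computed from String.length() would be one byte short.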

Upvotes: 0

Jason S

Reputation: 189796

Darnit, I figured it out:

Instead of

response.setContentType("text/xml");

I need to do:

response.setContentType("text/xml; charset=utf-8");
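
For anyone wondering why this works: as far as I understand the Servlet spec, the PrintWriter from response.getWriter() encodes with the response's character encoding, which falls back to ISO-8859-1 when none has been set. A charset parameter in setContentType sets it, as long as the call comes before getWriter(). U+0301 has no mapping in ISO-8859-1, so the encoder substitutes '?'. The sketch below reproduces that effect with plain java.io and no servlet container. The class name CharsetDemo and the roundTrip helper are made up for illustration, and this is only an approximation of what the container does.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharsetDemo {
    // Write the text through a Writer that encodes with the given charset,
    // then decode the resulting bytes with the same charset, roughly as a
    // client would when it trusts the declared encoding.
    public static String roundTrip(String text, Charset charset) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(bytes, charset)) {
            writer.write(text);
        }
        return new String(bytes.toByteArray(), charset);
    }

    public static void main(String[] args) throws IOException {
        String name = "Garci\u0301a"; // 'i' followed by COMBINING ACUTE ACCENT (U+0301)

        // ISO-8859-1 cannot represent U+0301, so it is replaced on output.
        System.out.println(roundTrip(name, StandardCharsets.ISO_8859_1)); // prints "Garci?a"

        // UTF-8 can represent every character, so the text survives intact.
        System.out.println(roundTrip(name, StandardCharsets.UTF_8).equals(name)); // prints "true"
    }
}
```

Calling response.setCharacterEncoding("UTF-8") before getWriter() should have the same effect as adding the charset to the content type. Note that the "UTF-8" passed to writeStartDocument only ends up in the XML declaration; it does not change how the underlying Writer encodes its output.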

Upvotes: 4
