unpix

Reputation: 853

character encoding from XML to Java

 <?xml version="1.0" encoding="UTF-8"?>

I'll include only the code extracts that I think are relevant.

I'm reading some information from an XML document via an HTTP request, something like this:

        // defaultHttpClient
        DefaultHttpClient httpClient = new DefaultHttpClient();
        HttpPost httpPost = new HttpPost(url);

        HttpResponse httpResponse = httpClient.execute(httpPost);
        HttpEntity httpEntity = httpResponse.getEntity();
        xml = EntityUtils.toString(httpEntity);

If I print the string xml to the screen, I can already see problems with the encoding.

Then, to return a document, I have this:

        Document doc = null;
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();

        DocumentBuilder db = dbf.newDocumentBuilder();

        InputSource is = new InputSource();
        is.setCharacterStream(new StringReader(xml));

        doc = db.parse(is); 

Although I'm fetching the information correctly from the HTTP request, I'm having problems with the character encoding when I display the data.

I already tried is.setEncoding("UTF-8"), but it didn't work.

Upvotes: 0

Views: 335

Answers (1)

jtahlborn

Reputation: 53694

The problem is that you converted the XML to a String (characters). Don't do that: you most likely used the wrong encoding and corrupted the XML. Treat XML as binary data (bytes).
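To illustrate how decoding with the wrong charset corrupts the data, here is a minimal sketch using only the JDK. It assumes the server sent UTF-8 bytes without declaring a charset in the Content-Type header, in which case (as far as I know) HttpClient 4.x's EntityUtils.toString(entity) falls back to ISO-8859-1. The class name MojibakeDemo, the helper decodeAsLatin1, and the sample element are made up for the example.

```java
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    // Hypothetical helper: decodes the bytes as ISO-8859-1, which is the
    // assumed fallback when the response declares no charset.
    static String decodeAsLatin1(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Stand-in for a UTF-8 response body; \u00e9 is "e with acute accent".
        byte[] body = "<name>caf\u00e9</name>".getBytes(StandardCharsets.UTF_8);

        // Wrong charset: the two UTF-8 bytes of the accented letter
        // become two separate Latin-1 characters.
        System.out.println(decodeAsLatin1(body));

        // Right charset: the original text comes back intact.
        System.out.println(new String(body, StandardCharsets.UTF_8));
    }
}
```

Once the String has been built this way, calling is.setEncoding("UTF-8") on an InputSource that wraps a StringReader cannot undo the damage, because the parser is already receiving decoded characters.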

You could use EntityUtils.toByteArray (okay), or you could pass the HttpEntity stream directly to the XML parser (ideal).
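A rough sketch of the byte-based approach, using only the JDK so it runs on its own. The byte array stands in for what EntityUtils.toByteArray(httpEntity) would return; with a live response, the stream from httpEntity.getContent() could presumably be passed to the same method. The class name XmlBytesDemo and the helper parse are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class XmlBytesDemo {
    // Hypothetical helper: hands raw bytes to the parser, which detects the
    // encoding itself from the XML declaration (or a byte order mark).
    static Document parse(InputStream in) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        DocumentBuilder db = dbf.newDocumentBuilder();
        return db.parse(in);
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for EntityUtils.toByteArray(httpEntity).
        byte[] body = ("<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<name>caf\u00e9</name>").getBytes(StandardCharsets.UTF_8);

        Document doc = parse(new ByteArrayInputStream(body));

        // The text content is decoded according to the declared encoding.
        System.out.println(doc.getDocumentElement().getTextContent());
    }
}
```

No String or Reader is involved before parsing, so there is no step where the wrong charset could be applied.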

Upvotes: 4
