Tony

Reputation: 3638

Java XML encoding error

I've implemented the following code within my Java application:

    //create string from xml tree
    StringWriter sw = new StringWriter();
    StreamResult result = new StreamResult(sw);
    DOMSource source = new DOMSource(doc);
    trans.transform(source, result);
    String xmlString = sw.toString().replaceAll("&[^;]+?;", ""); //Replace invalid HTML characters

    //print xml
    System.out.println("The XML is:\n\n" + xmlString);

    OutputStream out = new FileOutputStream(dir.getSearchFileOutputDirectory() + "\\" + "output.xml");

    //Write the XML to disk
    out.write(xmlString.getBytes("ISO-8859-1"));
    out.close();

Now, if I run this within NetBeans, the XML file renders perfectly in Chrome, IE and Firefox. However, once I clean and build the code and then run the standalone JAR file, the browsers report encoding errors within the file and fail to render it.

The lines they fail on don't contain anything out of the ordinary, just standard ASCII characters as far as I can see.

Can anyone shed light on why this might be happening? I've got to demo the code tomorrow and now I'm getting in a panic as to why it's suddenly doing this strange thing...

Any input would be greatly appreciated.

Thanks

Tony

Upvotes: 1

Views: 505

Answers (2)

Michael Kay

Reputation: 163585

Because you're fiddling around with the file at the level of characters and bytes, rather than using XML-aware tools, encoding is entirely your responsibility. You seem to be making no attempt to ensure that the encoding used to write the output stream is the same as the encoding appearing in the XML declaration, so this kind of failure seems inevitable.
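One way to keep the declared encoding and the written bytes in agreement is to let the transformer serialize straight to a byte stream, so that it both writes the XML declaration and encodes the output itself. The following is only a minimal sketch of that idea, not necessarily what the asker's application needs: the class name `XmlEncodingSketch`, the method `writeXml` and the sample `greeting` document are hypothetical, and a `ByteArrayOutputStream` stands in for the `FileOutputStream` from the question.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical class name, used only for this illustration.
public class XmlEncodingSketch {

    // Serializes the DOM directly to a byte stream. The transformer writes the
    // XML declaration and encodes the bytes using the same requested encoding,
    // so the two cannot drift apart.
    public static void writeXml(Document doc, OutputStream out, String encoding) throws Exception {
        Transformer trans = TransformerFactory.newInstance().newTransformer();
        trans.setOutputProperty(OutputKeys.ENCODING, encoding);
        trans.transform(new DOMSource(doc), new StreamResult(out));
    }

    public static void main(String[] args) throws Exception {
        // Build a tiny sample document containing one non-ASCII character (e-acute).
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element root = doc.createElement("greeting");
        root.setTextContent("caf\u00e9");
        doc.appendChild(root);

        // In the question this would be a FileOutputStream for output.xml.
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeXml(doc, out, "ISO-8859-1");

        // Decode with the same charset just to display the result.
        System.out.println(new String(out.toByteArray(), StandardCharsets.ISO_8859_1));
    }
}
```

With this approach there is no intermediate `String` and no separate `getBytes(...)` call, so there is only one place where the encoding is chosen.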

Upvotes: 1

bmargulies

Reputation: 100151

XML isn't HTML. That replace operation is a bad thing. If your style sheet says that the output is HTML, it will work. If it says that the output is XML, then don't try to use it as HTML.
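To illustrate why the replace is risky, here is a small sketch that applies the same regular expression from the question to a made-up XML fragment. The class name `EntityStripSketch`, the method `stripEntities` and the sample `price` element are hypothetical; only the regex itself comes from the question.

```java
// Hypothetical class name, used only for this illustration.
public class EntityStripSketch {

    // Same pattern as in the question: deletes anything of the form &...;
    public static String stripEntities(String xml) {
        return xml.replaceAll("&[^;]+?;", "");
    }

    public static void main(String[] args) {
        // &amp; and &lt; are standard XML escapes and &#233; is a character
        // reference; all three are legitimate content, not "invalid HTML".
        String xml = "<price note=\"R&amp;D\">5 &lt; 10 &#233;</price>";

        // The escaped ampersand, the less-than sign and the e-acute are all
        // silently removed from the data.
        System.out.println(stripEntities(xml));
    }
}
```

The serializer emits those references precisely so that the output stays well-formed, which is why removing them changes the data rather than cleaning it.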

Upvotes: 0
