Reputation: 3638
I've implemented the following code in my Java application:
//create string from xml tree
StringWriter sw = new StringWriter();
StreamResult result = new StreamResult(sw);
DOMSource source = new DOMSource(doc);
trans.transform(source, result);
String xmlString = sw.toString().replaceAll("&[^;]+?;", ""); //Replace invalid HTML characters
//print xml
System.out.println("The XML is:\n\n" + xmlString);
OutputStream out = new FileOutputStream(dir.getSearchFileOutputDirectory() + "\\" + "output.xml");
//Write the XML to disk
out.write(xmlString.getBytes("ISO-8859-1"));
out.close();
Now, if I run this within NetBeans, the XML file renders perfectly in Chrome, IE and Firefox. However, once I clean and build the code and then run the standalone JAR file, the browsers report encoding errors in the file and fail to render it.
The thing is, the lines they are failing at don't contain anything out of the ordinary, just standard ASCII characters as far as I can see.
Can anyone shed light on why this might be happening? I've got to demo the code tomorrow, and now I'm getting in a panic as to why it's suddenly doing this strange thing...
Any input would be greatly appreciated.
Thanks
Tony
Upvotes: 1
Views: 505
Reputation: 163585
Because you're fiddling around with the file at the level of characters and bytes, rather than using XML-aware tools, encoding is entirely your responsibility. You seem to be making no attempt to ensure that the encoding used to write the output stream is the same as the encoding appearing in the XML declaration, so this kind of failure seems inevitable.
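One way to keep the two in step, as a minimal sketch rather than a drop-in fix: let the transformer write straight to the output stream and tell it which encoding to use, so the same component produces both the XML declaration and the bytes. The class name `XmlEncodingSketch`, the `write` helper and the sample `results` element are made up for illustration; ISO-8859-1 is assumed only because that is what the question passes to `getBytes`.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class XmlEncodingSketch {

    // Hypothetical helper: serializes the document directly to bytes, so the
    // serializer writes the encoding declaration and does the
    // character-to-byte conversion itself, using the same encoding for both.
    static void write(Document doc, OutputStream out, String encoding) throws Exception {
        Transformer trans = TransformerFactory.newInstance().newTransformer();
        trans.setOutputProperty(OutputKeys.ENCODING, encoding);
        trans.transform(new DOMSource(doc), new StreamResult(out));
    }

    public static void main(String[] args) throws Exception {
        // Sample document with one non-ASCII character (illustrative data only).
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element root = doc.createElement("results");
        root.setTextContent("caf\u00e9");
        doc.appendChild(root);

        // A FileOutputStream would work the same way; a byte array keeps the demo self-contained.
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        write(doc, out, "ISO-8859-1");
        System.out.println(new String(out.toByteArray(), StandardCharsets.ISO_8859_1));
    }
}
```

With this shape there is no intermediate `String`, so there is no separate `getBytes` call that can disagree with the declaration.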
Upvotes: 1
Reputation: 100151
XML isn't HTML. That replace operation is a bad thing. If your style sheet says that the output is HTML, it will work. If it says that the output is XML, then don't try to use it as HTML.
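To see why the replace is risky, here is a small sketch that applies the same pattern from the question to a made-up string. The class name `EntityStripSketch`, the `strip` helper and the sample text are hypothetical; only the regular expression comes from the question.

```java
public class EntityStripSketch {

    // Same pattern as in the question: deletes every "&...;" run,
    // including the predefined XML entities such as &amp; and &lt;.
    static String strip(String xml) {
        return xml.replaceAll("&[^;]+?;", "");
    }

    public static void main(String[] args) {
        String xml = "<note>Fish &amp; chips &lt; 5</note>";
        // The escaped ampersand and less-than sign are silently dropped.
        System.out.println(strip(xml)); // prints "<note>Fish  chips  5</note>"
    }
}
```

The serializer emits those references precisely to keep the document well-formed, so stripping them changes the data rather than cleaning it.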
Upvotes: 0