Reputation: 317
I'm having some trouble with JDOM2, which I use to work with XML files. I want to convert an XML file to a string without any manipulation or optimization.
This is my Java code for that:
SAXBuilder builder = new SAXBuilder();
File xmlFile = f;
try
{
    Document document = (Document) builder.build(xmlFile);
    xml = new XMLOutputter().outputString(document);
} catch (Exception e) {
    System.out.println(e.getMessage());
}
return xml;
But when I compare my string with the original XML file I notice some changes.
The original:
<?xml version="1.0" encoding="windows-1252"?>
<xmi:XMI xmi:version="2.1" xmlns:uml="http://schema.omg.org/spec/UML/2.0" xmlns:xmi="http://schema.omg.org/spec/XMI/2.1" xmlns:thecustomprofile="http://www.sparxsystems.com/profiles/thecustomprofile/1.0" xmlns:SoaML="http://www.sparxsystems.com/profiles/SoaML/1.0">
And the string:
<?xml version="1.0" encoding="UTF-8"?>
<xmi:XMI xmlns:xmi="http://schema.omg.org/spec/XMI/2.1" xmlns:SoaML="http://www.sparxsystems.com/profiles/SoaML/1.0" xmlns:thecustomprofile="http://www.sparxsystems.com/profiles/thecustomprofile/1.0" xmlns:uml="http://schema.omg.org/spec/UML/2.0" xmi:version="2.1">
All umlauts (ä, ö, ü) are changed too: I get something like '�' instead of 'ä'.
Is there any way to stop that behavior?
Upvotes: 1
Views: 18167
Reputation: 44292
Firstly, as others have stated, you shouldn't use any XML processing. Just read the file as a text file.
Secondly, your umlaut characters showing up as '�' is due to an incorrect charset (encoding) being used. The charset error may be in your code, or it may be the XML file.
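To illustrate the charset mismatch described above, here is a minimal sketch using only the JDK. The class and method names are made up for this example, and it assumes the JVM provides the windows-1252 charset (common JDK builds do). It encodes 'ä' as windows-1252 and then decodes the same bytes once with the matching charset and once as UTF-8:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of JDOM or the original question.
class CharsetMismatchDemo {

    // Decodes raw bytes with the given charset. new String(byte[], Charset)
    // substitutes the replacement character U+FFFD for malformed input.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // Assumption: windows-1252 is available in this JVM.
        Charset windows1252 = Charset.forName("windows-1252");

        // "\u00e4" is 'ä'; in windows-1252 it is the single byte 0xE4.
        byte[] bytes = "\u00e4".getBytes(windows1252);

        // Matching charset: the umlaut survives.
        System.out.println(decode(bytes, windows1252));

        // Wrong charset: 0xE4 alone is not valid UTF-8, so the decoder
        // emits the replacement character instead of 'ä'.
        System.out.println(decode(bytes, StandardCharsets.UTF_8));
    }
}
```

The same thing can happen in the other direction: if the file is really UTF-8 and is decoded as windows-1252, each umlaut turns into two unrelated characters rather than '�'.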
The original XML file contains encoding="windows-1252", but it's unusual for XML to be encoded in anything other than UTF-8, so I suspect the file is really a UTF-8 file and the encoding it claims to use is not correct.
Try forcing UTF-8 when reading the file. It's good practice, regardless, to specify the charset when converting bytes to text:
String xml = new String(
    Files.readAllBytes(xmlFile.toPath()), StandardCharsets.UTF_8);
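If it is unclear which encoding the file really uses, it may help to make the charset a parameter and try both candidates. The following is only a sketch: XmlFileReader and readXmlAsString are hypothetical names, and the temporary file in main stands in for the real XML file.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper class for illustration only.
class XmlFileReader {

    // Reads the whole file and decodes it with the given charset.
    // No XML parsing takes place, so attribute order and the XML
    // declaration stay exactly as they are in the file.
    static String readXmlAsString(Path xmlPath, Charset charset) throws IOException {
        return new String(Files.readAllBytes(xmlPath), charset);
    }

    public static void main(String[] args) throws IOException {
        // Assumption: windows-1252 is available in this JVM.
        Charset windows1252 = Charset.forName("windows-1252");

        // Write a small sample file encoded as windows-1252.
        Path sample = Files.createTempFile("sample", ".xml");
        String content = "<?xml version=\"1.0\" encoding=\"windows-1252\"?>\n<a>\u00e4</a>\n";
        Files.write(sample, content.getBytes(windows1252));

        // Decode with the declared charset, then with UTF-8, and compare.
        System.out.println(readXmlAsString(sample, windows1252));
        System.out.println(readXmlAsString(sample, StandardCharsets.UTF_8));

        Files.delete(sample);
    }
}
```

Whichever charset prints the umlauts correctly is the one the file was actually saved in.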
Upvotes: 7
Reputation: 183
See if this works for you.
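// filename is the file path string
StringBuilder sb = new StringBuilder();
// try-with-resources closes the reader; FileReader uses the platform default charset
try (BufferedReader br = new BufferedReader(new FileReader(new File(filename)))) {
    String line;
    while ((line = br.readLine()) != null) {
        // trim() strips leading and trailing whitespace; line breaks are dropped as well
        sb.append(line.trim());
    }
}
String xml = sb.toString();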
Upvotes: 0
Reputation: 29
Try this:
String xmlToString = FileUtils.readFileToString(new File("/file/path/file.xml"));
You need the Apache Commons IO jar for this.
Upvotes: 0