Alucard

Reputation: 317

Convert XML file to string without manipulation or optimization in Java

I have some trouble with JDOM2, which I use to work with XML files. I want to convert an XML file to a string without any manipulation or optimization.

This is my Java code for that:

SAXBuilder builder = new SAXBuilder();
File xmlFile = f;

try {
    Document document = builder.build(xmlFile);
    xml = new XMLOutputter().outputString(document);
} catch (Exception e) {
    System.out.println(e.getMessage());
}

return xml;

But when I compare my string with the original XML file I notice some changes.

The original:

<?xml version="1.0" encoding="windows-1252"?>
<xmi:XMI xmi:version="2.1" xmlns:uml="http://schema.omg.org/spec/UML/2.0" xmlns:xmi="http://schema.omg.org/spec/XMI/2.1" xmlns:thecustomprofile="http://www.sparxsystems.com/profiles/thecustomprofile/1.0" xmlns:SoaML="http://www.sparxsystems.com/profiles/SoaML/1.0">

And the string:

<?xml version="1.0" encoding="UTF-8"?>
<xmi:XMI xmlns:xmi="http://schema.omg.org/spec/XMI/2.1" xmlns:SoaML="http://www.sparxsystems.com/profiles/SoaML/1.0" xmlns:thecustomprofile="http://www.sparxsystems.com/profiles/thecustomprofile/1.0" xmlns:uml="http://schema.omg.org/spec/UML/2.0" xmi:version="2.1">

And all umlauts (ä, ö, ü) are changed too. I get something like '�' instead of 'ä'.

Is there any way to stop that behavior?

Upvotes: 1

Views: 18167

Answers (3)

VGR

Reputation: 44292

Firstly, as others have stated, you shouldn't use any XML processing. Just read the file as a text file.

Secondly, your umlaut characters showing up as '�' is due to an incorrect charset (encoding) being used. The charset error may be in your code, or it may be the XML file.
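To illustrate how a charset mismatch produces '�', here is a small self-contained sketch. The class name CharsetMismatchDemo and the helper roundTrip are made up for this example, and it assumes the JDK ships the windows-1252 charset (mainstream JDKs do). It encodes 'ä' as windows-1252 and then decodes the same bytes once as UTF-8 and once as windows-1252:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetMismatchDemo {
    // Hypothetical helper: encode text as windows-1252, then decode the
    // resulting bytes with the charset passed in.
    public static String roundTrip(String text, Charset decodeAs) {
        byte[] bytes = text.getBytes(Charset.forName("windows-1252"));
        return new String(bytes, decodeAs);
    }

    public static void main(String[] args) {
        String umlaut = "\u00E4"; // 'ä', written as an escape to avoid source-encoding issues

        // Wrong charset: the single byte 0xE4 is not valid UTF-8 on its own,
        // so the decoder substitutes the replacement character U+FFFD.
        System.out.println(roundTrip(umlaut, StandardCharsets.UTF_8));

        // Matching charset: the original character comes back.
        System.out.println(roundTrip(umlaut, Charset.forName("windows-1252")));
    }
}
```

The same thing happens in the other direction too: UTF-8 bytes decoded as windows-1252 come out as wrong characters, just different ones.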

The original XML file contains encoding="windows-1252", but it's unusual for XML to be encoded in anything other than UTF-8, so I suspect the file is really a UTF-8 file and the encoding it claims to use is not correct.

Try forcing UTF-8 when reading the file. It's good practice, regardless, to specify the charset when converting bytes to text:

String xml = new String(
    Files.readAllBytes(xmlFile.toPath()), StandardCharsets.UTF_8);
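If forcing UTF-8 does not help and the file really is windows-1252, as its declaration says, the same approach works with that charset. A minimal sketch, assuming a hypothetical class RawXmlReader with a helper readRaw (neither is from the question); the main method writes a throwaway temp file just to show that the text comes back unchanged:

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;

public class RawXmlReader {
    // Hypothetical helper: decode the file's bytes with the given charset.
    // No XML parsing happens, so attribute order, whitespace and the
    // XML declaration are returned exactly as they appear in the file.
    public static String readRaw(Path file, Charset charset) throws IOException {
        return new String(Files.readAllBytes(file), charset);
    }

    public static void main(String[] args) throws IOException {
        Charset cp1252 = Charset.forName("windows-1252");
        String original = "<?xml version=\"1.0\" encoding=\"windows-1252\"?>\n"
                + "<a z=\"2\" b=\"1\">\u00E4\u00F6\u00FC</a>\n";

        // Write sample content to a temporary file in windows-1252.
        Path file = Files.createTempFile("sample", ".xml");
        Files.write(file, original.getBytes(cp1252));

        // Read it back with the same charset and compare with the original text.
        String xml = readRaw(file, cp1252);
        System.out.println(xml.equals(original));

        Files.delete(file);
    }
}
```

Whichever charset you pass has to match how the file was actually saved; the encoding attribute in the XML declaration is only a claim about that.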

Upvotes: 7

Droy

Reputation: 183

See if this works for you.

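// filename is the file path as a string
StringBuilder sb = new StringBuilder();
// try-with-resources closes the reader; FileReader uses the platform default charset
try (BufferedReader br = new BufferedReader(new FileReader(new File(filename)))) {
    String line;
    while ((line = br.readLine()) != null) {
        // Note: trim() strips leading/trailing whitespace, and line breaks are not kept
        sb.append(line.trim());
    }
}
String xml = sb.toString();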

Upvotes: 0

Prateek1421

Reputation: 29

Try this:

String xmlToString = FileUtils.readFileToString(new File("/file/path/file.xml"));

You need the Commons IO jar on your classpath for this.

Upvotes: 0
