Reputation: 17548

Newline characters not recognized in XML attribute value? Java, DOM, JTextArea

I have an XML file from which I extract the comment attribute of the material element and use it to populate a JTextArea (example XML below).

The problem is that the text is displayed on a single line, ignoring the newlines.

This is strange because all of my text editors and XML viewers show the comment attribute value spanning multiple lines.

Even stranger, when I inspect the string character by character in Java, the end of each line turns out to be just two space characters (ASCII value 32), with no newline.

How can Notepad, Notepad++, Internet Explorer, Altova XMLSpy, etc. all interpret these invisible newlines? Is it possible that Java is ignoring characters? Or is it a problem with the Java DOM parser?

<material 
    version="1.4" units="kg"
    comment="12AUG2012 -- An Extended Summary plot was added which includes the Monitor value.  J. Doe
15AUG2012 -- - Added  summary plot definition as requested.  J. Doe
27JAN2013 -- Fixed spacing issues between title and headings   J. Doe
03MAR2013 -- Added longName property to material file.  Updated summary plot legends with new heading convention, i.e. Mean Area and Area of Concern.  J. Doe

Upvotes: 1

Answers (1)

Daniel Camarda

Reputation: 1126

There is nothing wrong with the Java DOM parser; this is the intended way to process whitespace characters inside attribute values.

Quoting the W3C Recommendation for XML 1.0:

Before the value of an attribute is passed to the application or checked for validity, the XML processor must normalize the attribute value by applying the algorithm below, or by using some other method such that the value passed to the application is the same as that produced by the algorithm.

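The algorithm is described in section 3.3.3 (Attribute-Value Normalization) of the Recommendation, but basically it replaces each literal whitespace character (space, tab, carriage return, line feed) in the attribute value with a space, so it is normal to lose the newline characters after parsing the XML.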
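To see the normalization in action, here is a minimal sketch using the JDK's built-in DOM parser. The class name AttrNormalizationDemo and the helper readComment are made up for illustration, and the XML is a shortened stand-in for the file in the question.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of the original question.
class AttrNormalizationDemo {

    // Parses the XML text and returns the root element's "comment" attribute.
    static String readComment(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getAttribute("comment");
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // The attribute value contains a literal line feed and a literal tab.
        String xml = "<material comment=\"first entry\nsecond\tentry\"/>";
        String value = readComment(xml);

        // Print every character code: the line feed (10) and the tab (9)
        // both show up as 32 (space) after parsing.
        for (char c : value.toCharArray()) {
            System.out.print((int) c + " ");
        }
        System.out.println();
    }
}
```

The source string still contains the line feed and the tab; only the value handed back by the parser has them replaced with spaces.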

The reason Notepad++ and the others show the newlines is that they are not parsing the XML; they display the raw text directly, which has not been processed and still contains the original newline characters.
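If the line breaks have to survive parsing, one option is to write them into the file as character references such as &#10; instead of literal newlines. This assumes you control how the file is produced. The normalization algorithm in the Recommendation appends a referenced character unchanged rather than replacing it with a space. A minimal sketch, with the hypothetical names CharRefDemo and commentOf:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of the original question.
class CharRefDemo {

    // Parses the XML text and returns the root element's "comment" attribute.
    static String commentOf(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getAttribute("comment");
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // The line break is written as the character reference &#10;.
        String xml = "<material comment=\"first entry&#10;second entry\"/>";
        String value = commentOf(xml);

        // The parsed value contains a real line feed between the entries.
        System.out.println(value.contains("\n"));
        System.out.println(value);
    }
}
```

A string obtained this way can be passed to JTextArea.setText, which starts a new line at each line feed.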

Upvotes: 4
