Reputation: 22317
I use libxml2 for parsing my XML configuration file. The newest feature request involves the "correct handling of meaningful whitespace", e.g. newlines should be kept.
Currently I get the attribute values with xmlGetProp.
I know that the XML parser usually normalizes whitespace in attribute values, as the standard requires (replacing each whitespace character with a space and, for attribute types other than CDATA, also collapsing runs of spaces and stripping leading and trailing spaces).
I wonder if there is a way to make sure the embedded newlines in the attribute values are kept.
Upvotes: 0
Views: 1791
Reputation: 43421
Did you try the xml:space attribute or xmlNodeGetSpacePreserve()?
<para xml:space="preserve">
Upvotes: 1
Reputation: 5652
As you note, this is required by the XML spec, so there is no way in a DTD or Schema to stop the normalisation.
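For reference, the rule for a plain (CDATA) attribute boils down to roughly the sketch below. `normalize_cdata_attr` is a made-up helper for illustration, not a libxml2 function, and it assumes line endings have already been normalised to `\n`, which the parser does before this step.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper (not part of libxml2): mimics what an XML parser
 * does to the literal text of a CDATA attribute value.
 * Assumption: line endings were already normalised to '\n'.
 * Every literal tab or newline is replaced by one space, in place. */
static void normalize_cdata_attr(char *s)
{
    assert(s != NULL);
    for (; *s != '\0'; ++s) {
        if (strchr("\t\n", *s) != NULL)
            *s = ' ';
    }
}
```

Calling it on a buffer holding "1", newline, "2", newline, "3" leaves "1 2 3" in the buffer, which matches what xmllint prints for the attribute.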
You can probably use libxml's HTML parser instead, though. Running the command-line xmllint utility on an input file of
<a>
<b x="1
2
3"/>
</a>
I get
$ xmllint abc.xml
<?xml version="1.0"?>
<a>
<b x="1 2 3"/>
</a>
so the newlines have gone, but:
$ xmllint --html abc.xml
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html><body><a>
<b x="1
2
3"></b>
</a></body></html>
The newlines are kept. The parser adds spurious inferred html and body elements, but you could drop those after parsing in your application.
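If you control how the file is written, another option that stays within XML is to store the newlines as character references. The spec exempts character references from attribute-value normalisation, so `<b x="1&#10;2&#10;3"/>` should come back from xmlGetProp with real newlines. A minimal sketch of the escaping side, where `escape_attr_whitespace` is a hypothetical helper (not a libxml2 call) and the escaping of `&`, `<` and quotes is assumed to happen elsewhere:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of libxml2): returns a newly allocated
 * copy of `value` with tab, LF and CR written as character references,
 * so an XML parser will not normalise them to spaces.
 * It does NOT escape '&', '<' or quotes.
 * The caller must free() the result; returns NULL if allocation fails. */
static char *escape_attr_whitespace(const char *value)
{
    assert(value != NULL);
    size_t len = strlen(value);
    /* Worst case: every byte becomes a 5-byte reference such as "&#10;". */
    char *out = malloc(len * 5 + 1);
    if (out == NULL)
        return NULL;

    char *p = out;
    for (const char *s = value; *s != '\0'; ++s) {
        const char *ref = NULL;
        switch (*s) {
        case '\t': ref = "&#9;";  break;
        case '\n': ref = "&#10;"; break;
        case '\r': ref = "&#13;"; break;
        default: break;
        }
        if (ref != NULL) {
            size_t n = strlen(ref);
            memcpy(p, ref, n);   /* copy the reference text */
            p += n;
        } else {
            *p++ = *s;           /* ordinary byte, copied unchanged */
        }
    }
    *p = '\0';
    return out;
}
```

This only helps when the program (or a person who knows the convention) writes the configuration file. Literal newlines typed into an attribute are still turned into spaces by a conforming XML parser.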
Upvotes: 1