kiran kumar

Reputation: 45

Comparing two XML files with XMLUnit while ignoring whitespace differences

I have Java code that compares two XML files using XMLUnit. It reports a difference for an extra space in one of the lines, which I want to ignore. How can I achieve this?

These are the settings I use:

XMLUnit.setCompareUnmatched(true);
XMLUnit.setIgnoreAttributeOrder(true);
XMLUnit.setIgnoreComments(true);
XMLUnit.setNormalize(true);

For example, xml1:

<metadata><div><p>This is a test</p></div></metadata>

xml2:

<metadata><div><p>This is a test</p> </div></metadata>

In the example above, the space between the </p> and </div> tags in xml2 is reported as a difference.

Upvotes: 1

Views: 352

Answers (1)

LMC

Reputation: 12777

The space after the closing </p> tag is a text node in its own right (a child of div), so it might not be possible to remove or ignore it through configuration alone. You can see the node with xmllint:

xmllint --shell test.xml

/ > xpath /metadata/div/descendant-or-self::*/text()
Object is a Node Set :
Set contains 2 nodes:
1  TEXT
    content=This is a test
2  TEXT
    content=  
/ >

Getting each text node:

echo "'$(xmllint --xpath '(/metadata/div/descendant-or-self::*/text())[1]' test.xml)'"
'This is a test'

echo "'$(xmllint --xpath '(/metadata/div/descendant-or-self::*/text())[2]' test.xml)'"
'  '
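The same check can be done from Java with the JDK's built-in XPath support, without XMLUnit. This is only a sketch: the class name TextNodeLister and the helper textNodes are made up for illustration, and the XML string is the xml2 sample from the question.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of XMLUnit.
class TextNodeLister {

    // Evaluate the same XPath used with xmllint and return the matching text nodes.
    static NodeList textNodes(String xml) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return (NodeList) xpath.evaluate(
                "/metadata/div/descendant-or-self::*/text()",
                new InputSource(new StringReader(xml)),
                XPathConstants.NODESET);
    }

    public static void main(String[] args) throws Exception {
        String xml2 = "<metadata><div><p>This is a test</p> </div></metadata>";
        NodeList nodes = textNodes(xml2);
        // Print each text node in quotes so whitespace-only nodes are visible.
        for (int i = 0; i < nodes.getLength(); i++) {
            System.out.println((i + 1) + ": '" + nodes.item(i).getNodeValue() + "'");
        }
    }
}
```

The whitespace-only node shows up as a separate entry, which is why the comparison sees an extra child under div.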

From the normalize-space() docs:

The normalize-space function strips leading and trailing white-space from a string, replaces sequences of whitespace characters by a single space, and returns the resulting string.

From the XMLUnit docs for setIgnoreWhitespace(boolean ignore):

Setting this parameter has no effect on whitespace inside texts.

Also in the same docs, on whitespace normalization:

Normalized in this context means that all whitespace is replaced by the space character and adjacent whitespace characters are collapsed to a single space character. It will also trim the resulting character content on both ends.
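If configuration alone does not help, one possible workaround is to drop whitespace-only text nodes from both documents before comparing them. The sketch below uses only the JDK DOM API rather than XMLUnit; the class BlankTextNodeStripper and its methods are hypothetical names for illustration, and it assumes that whitespace-only text nodes between elements carry no meaning in your documents.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of XMLUnit.
class BlankTextNodeStripper {

    // Parse an XML string into a DOM document.
    static Document parse(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    // Recursively remove text nodes that contain only whitespace.
    static void stripBlankText(Node node) {
        NodeList children = node.getChildNodes();
        // Walk backwards so that removing a child does not shift unvisited indices.
        for (int i = children.getLength() - 1; i >= 0; i--) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.TEXT_NODE
                    && child.getNodeValue().trim().isEmpty()) {
                node.removeChild(child);
            } else {
                stripBlankText(child);
            }
        }
    }

    // Compare two XML strings after dropping whitespace-only text nodes.
    static boolean equalIgnoringBlankText(String xml1, String xml2) throws Exception {
        Document a = parse(xml1);
        Document b = parse(xml2);
        stripBlankText(a);
        stripBlankText(b);
        return a.getDocumentElement().isEqualNode(b.getDocumentElement());
    }

    public static void main(String[] args) throws Exception {
        String xml1 = "<metadata><div><p>This is a test</p></div></metadata>";
        String xml2 = "<metadata><div><p>This is a test</p> </div></metadata>";
        System.out.println(equalIgnoringBlankText(xml1, xml2));
    }
}
```

Whitespace inside real text such as "This is a test" is left untouched, so differences there are still reported. The stripped documents could also be handed to XMLUnit for the actual comparison if you want its difference reporting instead of a plain boolean.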

Upvotes: 1
