Reputation: 1078

Comparing two similar XML data with unordered elements/attributes in Java

I'm looking for an API that compares two XML documents. I've tried XMLUnit 2 but couldn't get it to work with my example. Could you give me an example that works for my needs?

My first XML document, xml1:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<urlset xmlns="http://www.sitemap.org/schemas/sitemap/0.9">
    <url>
        <loc>a1/</loc>
        <lastmod>a2</lastmod>
    </url>
    <url>
        <loc>b1</loc>
        <lastmod>b2</lastmod>
    </url>
    <url>
        <loc>c1</loc>
        <lastmod>c2</lastmod>
    </url>
</urlset>

My second XML document, xml2:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<urlset xmlns="http://www.sitemap.org/schemas/sitemap/0.9">
    <url><lastmod>b2</lastmod><loc>b1</loc></url>
    <url>
        <lastmod>c2</lastmod>
        <loc>c1</loc>
    </url>
    <url>
        <loc>a1/</loc>
        <lastmod>a2</lastmod>
    </url>
</urlset>

Notice:

Same size (here 3 children)
urlset's child nodes (url) may not be ordered
url's elements (loc and lastmod) may not be ordered
Whitespace is ignored

Looking for an API which returns true like:

XMLUtils.isSimilar(xml1, xml2);

My unsuccessful attempts with XMLUnit 2 (tried with multiple "NodeMatcher"):

// Attempt with XmlAssert.assertThat:
XmlAssert.assertThat(xml1)
    .and(xml2)
    .ignoreChildNodesOrder()
    .ignoreWhitespace()
    .withNodeMatcher(new DefaultNodeMatcher(ElementSelectors.byNameAndText))
    .areSimilar();

// Attempt with Diff
Diff myDiff = DiffBuilder.compare(xml1)
    .withTest(xml2)
    .ignoreWhitespace()
    .checkForSimilar()
    .withNodeMatcher(new DefaultNodeMatcher(ElementSelectors.byNameAndText))
    .build();
myDiff.getDifferences();

Upvotes: 0

Answers (2)

user4524982

The biggest problem is likely the question "which url elements match each other?". I can only guess, and I assume that for you the url elements with the same text inside their loc child are the matching ones. This is what you need to tell XMLUnit.

Your example is very common, but it is still something that cannot be guessed (apart from brute-forcing all possible permutations and picking the one with the fewest differences). It is the running example of https://github.com/xmlunit/user-guide/wiki/SelectingNodes; you only need to replace tr with url and th with loc.

To make things concrete: when comparing the url elements, you want XMLUnit to look at their respective loc children and compare the nested text. In all other cases you are happy with selecting among sibling elements by name (there is only one urlset, and within each url the loc and lastmod siblings are uniquely identified by their tag names).

This translates to a conditional ElementSelector:

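// The sample documents declare a default namespace, so the XPath needs a prefix bound to it
// (the prefix "sm" is an arbitrary choice).
ElementSelector selector = ElementSelectors.conditionalBuilder()
    .whenElementIsNamed("url").thenUse(ElementSelectors
        .byXPath("./sm:loc",
            Collections.singletonMap("sm", "http://www.sitemap.org/schemas/sitemap/0.9"),
            ElementSelectors.byNameAndText))
    .elseUse(ElementSelectors.byName)
    .build();
// Pass it to the comparison with .withNodeMatcher(new DefaultNodeMatcher(selector))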

With that you should be able to get down to a "similar" result where the only differences found are child order differences.
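
For readers who want to see what "similar regardless of order" means without XMLUnit, here is a minimal sketch that uses only the JDK's DOM parser. It is a different technique from the ElementSelector approach above: each element is turned into a canonical string with its child elements sorted, and the two canonical strings are compared. The class name XmlOrderInsensitive and its methods are hypothetical, made up for illustration. The sketch assumes attributes can be ignored (the sample documents have none besides the namespace declaration) and that text is compared after trimming.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration; not part of XMLUnit or any library.
final class XmlOrderInsensitive {

    // True when both documents contain the same elements and text,
    // ignoring sibling order and surrounding whitespace.
    static boolean isSimilar(String xml1, String xml2) {
        return canonical(parse(xml1)).equals(canonical(parse(xml2)));
    }

    private static Element parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // keep namespace URI and local name apart
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Builds "{namespace}name[text][sorted child forms]" recursively.
    // Attributes, comments and processing instructions are ignored (assumption).
    private static String canonical(Element element) {
        List<String> children = new ArrayList<>();
        StringBuilder text = new StringBuilder();
        for (Node n = element.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) {
                children.add(canonical((Element) n));
            } else if (n.getNodeType() == Node.TEXT_NODE
                    || n.getNodeType() == Node.CDATA_SECTION_NODE) {
                text.append(n.getNodeValue());
            }
        }
        Collections.sort(children); // sorting makes sibling order irrelevant
        return "{" + element.getNamespaceURI() + "}" + element.getLocalName()
                + "[" + text.toString().trim() + "]" + children;
    }
}
```

Calling XmlOrderInsensitive.isSimilar(xml1, xml2) on the two documents from the question should then report them as similar. Unlike XMLUnit, this sketch only answers yes or no and does not tell you where the documents differ.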

Upvotes: 0

Nghia Do

Reputation: 2668

You can try the following:

public class XMLUtils {
    private static DocumentBuilderFactory documentBuilderFactory;
    private static DocumentBuilder documentBuilder;
    private static TransformerFactory transformerFactory;
    private static Transformer transformer;
    private static Document emptyDoc;

    public XMLUtils() {
    }

    public XMLCompareResult compare(File expectedFile, File actualFile, boolean ignoreWhiteSpace) throws FileNotFoundException, SAXException, IOException {
        // try-with-resources closes both streams even if the comparison throws
        try (FileInputStream expInpStream = new FileInputStream(expectedFile);
             FileInputStream actualInpStream = new FileInputStream(actualFile)) {
            DiffBuilder builder = DiffBuilder.compare(expInpStream)
                    .withTest(actualInpStream)
                    .checkForSimilar()
                    .withNodeMatcher(new DefaultNodeMatcher(ElementSelectors.byNameAndAllAttributes));
            if (ignoreWhiteSpace) {
                builder = builder.ignoreWhitespace();
            }
            Diff myDiff = builder.build();

            // XMLResultUtil and XMLCompareResult are project-specific classes that are not shown here
            XMLResultUtil xmlr = new XMLResultUtil();
            return xmlr.prepareXMLCompareResult(myDiff.getDifferences());
        }
    }

    static {
        try {
            documentBuilderFactory = DocumentBuilderFactory.newInstance();
            documentBuilder = documentBuilderFactory.newDocumentBuilder();
            transformerFactory = TransformerFactory.newInstance();
            transformer = transformerFactory.newTransformer();
            emptyDoc = documentBuilder.newDocument();
        } catch (ParserConfigurationException var1) {
            var1.printStackTrace();
        } catch (TransformerConfigurationException var2) {
            var2.printStackTrace();
        }

    }
}

I am copying here a method we use in my project.

Could you try it and let me know if you run into any issues? I can try it myself again.

Thank you

Upvotes: 2
