Reputation: 5805

How to check equivalence of two XML documents?

From my program I call a command-line XSLT processor (such as Saxon or xsltproc).

Then (for testing purposes) I want to compare the output of the processor with a predefined string.

The trouble is that the same XML document can be serialized in different ways. The following three are different strings, although they represent the same document:

<?xml version="1.0" encoding="utf-8"?>
<x/>

<?xml version="1.0"?>
<x/>

<?xml version="1.0"?>
<x
/>

How can I check that the output of different XSLT processors matches a given XML string?

Maybe there is a way (not necessarily standardized) to make different XSLT processors produce exactly the same output?

I use Python 3.

Upvotes: 1

Answers (3)

porton

Reputation: 5805

It can be easily done with minidom:

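from unittest import TestCase

from defusedxml.minidom import parseString


class XmlTest(TestCase):
    def assertXmlEqual(self, got, want):
        """Compare two XML strings after a parse and re-serialize round trip."""
        # toxml() writes a uniform XML declaration and tag layout, so
        # differences such as <x/> versus <x(newline)/> disappear.
        self.assertEqual(parseString(got).toxml(), parseString(want).toxml())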

Upvotes: 0

Michael Kay

Reputation: 163468

Have you looked at using a testing framework like XSpec that already addresses this issue?

Typically the two classic ways of solving this are to compare the serialized XML lexically after putting it through a canonicalizer, or to compare the tree representations using a function such as XPath 2.0 deep-equal().

Neither of these is a perfect answer. Firstly, the things which XML canonicalization considers to be significant or insignificant may not be the same as the things you consider significant or insignificant; and the same goes for XPath deep-equal(). Secondly, you really want to know not just whether the files are the same, but where the differences are.
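Since the question mentions Python 3, the canonicalization approach can be sketched with the standard library alone. This is a minimal sketch, assuming Python 3.8 or later (where `xml.etree.ElementTree.canonicalize` is available); the helper name `canonical_xml` and the choice of `strip_text=True` are illustrative assumptions about what should count as insignificant, not a recommendation that fits every case.

```python
from xml.etree.ElementTree import canonicalize


def canonical_xml(text):
    """Return the C14N 2.0 form of an XML string (hypothetical helper).

    strip_text=True also drops leading/trailing whitespace in text nodes;
    that is an assumption about what is insignificant for the comparison.
    """
    return canonicalize(text, strip_text=True)


# The three serializations from the question.
variants = [
    '<?xml version="1.0" encoding="utf-8"?>\n<x/>',
    '<?xml version="1.0"?>\n<x/>',
    '<?xml version="1.0"?>\n<x\n/>',
]

# All three collapse to a single canonical string.
assert len({canonical_xml(v) for v in variants}) == 1
```

In a test, comparing `canonical_xml(got)` with `canonical_xml(want)` gives a yes/no answer, but it does not say where two documents differ.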

Saxon has an enhanced version of deep-equal() called saxon:deep-equal() designed to address these issues: it takes a set of flags that can be used to customize the comparison, and it tries to tell you where the differences are in terms of warning messages. But it's not a perfect solution either.

In the W3C test suites for XSLT 3.0 and XQuery we've moved away from comparing XML outputs of tests to writing assertions against the expected results in terms of XPath expressions. The tests use assertions like this:

  <result>
     <all-of>
        <assert>every $a in /out/* except /out/a4 
                satisfies $a/@actual = $a/@expected</assert>
        <assert>/out/a4/@actual = 'false'</assert>
     </all-of>
  </result>
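The same idea carries over to a Python test: rather than comparing the whole output, check the properties that matter. The following is a rough sketch that hand-translates the two assertions above using `xml.etree.ElementTree`; the function name `check_output` and the assumption that the document root is `<out>` are illustrative only.

```python
import xml.etree.ElementTree as ET


def check_output(xml_text):
    """Apply the two assertions above to a document rooted at <out>."""
    root = ET.fromstring(xml_text)
    if root.tag != "out":
        return False
    # every $a in /out/* except /out/a4 satisfies $a/@actual = $a/@expected
    others_ok = all(
        a.get("actual") is not None and a.get("actual") == a.get("expected")
        for a in root
        if a.tag != "a4"
    )
    # /out/a4/@actual = 'false'
    a4_ok = any(a.get("actual") == "false" for a in root.findall("a4"))
    return others_ok and a4_ok
```

Because the check parses the output first, differences in the XML declaration or in whitespace inside tags do not affect the result.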

Upvotes: 3

E.Serra

Reputation: 1574

Do you care about the order of elements? If not:

Convert both documents to dictionaries, then run deepdiff on them.
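If installing extra packages is not an option, a similar order-insensitive comparison can be approximated with the standard library. Note that this sketch swaps in a different technique: it builds nested tuples with `xml.etree.ElementTree` instead of dictionaries, and it only returns True or False rather than a report of the differences. The names `to_comparable` and `xml_equal_unordered` are made up for illustration, and text that follows a child element (tail text) is ignored.

```python
import xml.etree.ElementTree as ET


def to_comparable(elem):
    """Turn an element into a nested tuple whose children are sorted."""
    children = sorted(to_comparable(child) for child in elem)
    return (
        elem.tag,
        tuple(sorted(elem.attrib.items())),  # attribute order is irrelevant
        (elem.text or "").strip(),           # ignore indentation whitespace
        tuple(children),                     # sibling order is irrelevant
    )


def xml_equal_unordered(got, want):
    """Compare two XML strings, ignoring sibling and attribute order."""
    return to_comparable(ET.fromstring(got)) == to_comparable(ET.fromstring(want))
```

For example, `xml_equal_unordered('<r><a/><b/></r>', '<r><b/><a/></r>')` treats the two documents as equal.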

Upvotes: 0
