mardok

Reputation: 2415

Why aren't these two XML strings similar?

I have two similar XML strings. I use XMLUnit to compare them, but when I run a sample test it reports that they are neither similar nor identical. I agree that they aren't identical, but I think similar() should return true. Below are my strings and the test code that I run.

<Errors>
  <Error>
    <Sheet>Sample1</Sheet>
    <ErrorCode>4</ErrorCode>
    <Columns>
      <Column>Id</Column>
      <Column>Name</Column>
    </Columns>
  </Error>
  <Error>
    <Sheet>Sample2</Sheet>
    <ErrorCode>4</ErrorCode>
    <Columns>
      <Column>Id</Column>
      <Column>Name</Column>
    </Columns>
  </Error>
</Errors>

and

<Errors>
  <Error>
    <Sheet>Sample1</Sheet>
    <ErrorCode>4</ErrorCode>
    <Columns>  
      <Column>Name</Column>
      <Column>Id</Column>
    </Columns>
  </Error>
  <Error>
    <Sheet>Sample2</Sheet>
    <ErrorCode>4</ErrorCode>
    <Columns>
      <Column>Name</Column>
      <Column>Id</Column>
    </Columns>
  </Error>
</Errors>

The only difference is that the Column nodes are reversed, but I think the comparison should still report both strings as similar.

import org.custommonkey.xmlunit.Diff; // XMLUnit 1.x

public void test() throws Exception {
    String myControlXML = "here goes xml1";
    String myTestXML = "here goes xml2";
    // Diff compares the control document against the test document
    Diff myDiff = new Diff(myControlXML, myTestXML);

    System.out.println("pieces of XML are similar " + myDiff.similar());
    System.out.println("but are they identical? " + myDiff.identical());
}

Upvotes: 1

Views: 278

Answers (1)

Pablo Lozano

Reputation: 10342

This is a guess, but I think the problem is that both tags have the same name. That sounds contradictory, so let me explain:

<root>
    <field>John</field>
    <field>Smith</field>
</root>

<root>
    <field>Smith</field>
    <field>John</field>
</root>

For me, these two pieces of XML are not similar, as one says John Smith and the other says Smith John.

<person>
    <name>John</name>
    <surname>Smith</surname>
</person>

<person>
    <surname>Smith</surname>
    <name>John</name>
</person>

These two are similar: not identical, but clearly both say John Smith.

In other words: as @JustinKSU says, order matters.

UPDATE: From the XMLUnit Java User's Guide: "Two pieces of XML are identical if there are no differences between them, similar if there are only recoverable differences between them, and different if there are any unrecoverable differences between them."

My second example shows two similar pieces of XML because the differences are recoverable: the element names tell you which value is which. The first one doesn't, because we don't know the correct order; maybe there is a guy whose first name is Smith, so we cannot be sure. Your example is pretty much the same case: XMLUnit cannot know whether the order of the columns is important. Imagine that your XML is used to choose the ordering of a SQL query:

SELECT * FROM table ORDER BY name, id is clearly not the same as SELECT * FROM table ORDER BY id, name.
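If you decide that sibling order really is irrelevant for your data, one option is to normalize both documents before comparing them. The following is only a minimal sketch of that idea using the JDK's DOM parser. It is not XMLUnit's API (XMLUnit has its own element-qualifier mechanism for matching elements, which this sketch does not use), and the class and method names (UnorderedXml, canonical, sameIgnoringOrder) are hypothetical. It assumes attributes and namespaces can be ignored and that text does not contain markup characters.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration; not part of XMLUnit.
final class UnorderedXml {

    // Builds a canonical string for an element in which the child elements
    // are sorted, so sibling order no longer affects the result.
    // Assumption: attributes are ignored and surrounding whitespace is trimmed.
    static String canonical(Element element) {
        List<String> children = new ArrayList<>();
        StringBuilder text = new StringBuilder();
        NodeList nodes = element.getChildNodes();
        for (int i = 0; i < nodes.getLength(); i++) {
            Node node = nodes.item(i);
            if (node.getNodeType() == Node.ELEMENT_NODE) {
                children.add(canonical((Element) node));
            } else if (node.getNodeType() == Node.TEXT_NODE
                    || node.getNodeType() == Node.CDATA_SECTION_NODE) {
                text.append(node.getNodeValue());
            }
        }
        Collections.sort(children);
        String name = element.getTagName();
        return "<" + name + ">" + text.toString().trim()
                + String.join("", children) + "</" + name + ">";
    }

    // Parses a string into its root element; wraps checked exceptions
    // so callers do not have to declare them.
    static Element parse(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Input is not well-formed XML", e);
        }
    }

    // True if both documents contain the same elements and text,
    // regardless of the order of sibling elements.
    static boolean sameIgnoringOrder(String controlXml, String testXml) {
        return canonical(parse(controlXml)).equals(canonical(parse(testXml)));
    }
}
```

With this, UnorderedXml.sameIgnoringOrder(xml1, xml2) would treat your two Errors documents as equal. Note that it would also treat my John Smith / Smith John example as equal, which is exactly why a library cannot safely make this choice for you.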

Upvotes: 2
