Reputation: 11
I am having some issues using xmldiff package. I'm using xmldiff package 0.9.2; PHP 5.4.17; Apache 2.2.25.
For example I have two xml files: "from.xml" & "to.xml".
File "from.xml" contains:
<?xml version="1.0" encoding="UTF-8"?>
<root>
<rott>
<NDC>321</NDC>
<NDC>123</NDC>
</rott>
</root>
File "to.xml" contains:
<?xml version="1.0" encoding="UTF-8"?>
<root>
<rott>
<NDC>123</NDC>
<NDC>321</NDC>
</rott>
</root>
I'm using code:
$zxo = new XMLDiff\File;
$dir1 = dirname(__FILE__) . "/upload/from.xml";
$dir2 = dirname(__FILE__) . "/upload/to.xml";
$diff = $zxo->diff($dir1, $dir2);
$file = 'differences.xml';
file_put_contents($file, $diff);
I get result in "differences.xml" file:
<?xml version="1.0"?>
<dm:diff xmlns:dm="http://www.locus.cz/diffmark">
<root>
<rott>
<dm:delete>
<NDC/>
</dm:delete>
<dm:copy count="1"/>
<dm:insert>
<NDC>321</NDC>
</dm:insert>
</rott>
</root>
</dm:diff>
Could you please comment from where this:
<dm:delete>
<NDC/>
</dm:delete>
comes?
Also please kindly inform me if there is a method which differs two xml files without matter of xml nodes order?
Upvotes: 1
Views: 2215
Reputation: 690
If the document structure is known, I think you can simply sort the necessary parts. Here's a useful acticle about it. Based on it, I've poked on some examples and could sort a document by node values (just for example), please look here
document library.xml
<?xml version="1.0"?>
<library>
<book id="1003">
<title>Jquery MVC</title>
<author>Me</author>
<price>500</price>
</book>
<book id="1001">
<title>Php</title>
<author>Me</author>
<price>600</price>
</book>
<book id="1002">
<title>Where to use IFrame</title>
<author>Me</author>
<price>300</price>
</book>
<book id="1002">
<title>American dream</title>
<author>Hello</author>
<price>300</price>
</book>
</library>
The PHP code, sorting by the <title>
<?php
$dom = new DOMDocument();
$dom->load('library.xml');
$xp = new DOMXPath($dom);
$booklist = $xp->query('/library/book');
$books = iterator_to_array($booklist);
function sort_by_title_node($a, $b)
{
$x = $a->getElementsByTagName('title')->item(0);
$y = $b->getElementsByTagName('title')->item(0);
return strcmp($x->nodeValue, $y->nodeValue) > 0;
}
usort($books, 'sort_by_title_node');
$newdom = new DOMDocument("1.0");
$newdom->formatOutput = true;
$root = $newdom->createElement("library");
$newdom->appendChild($root);
foreach ($books as $b) {
$node = $newdom->importNode($b,true);
$root->appendChild($newdom->importNode($b,true));
}
echo $newdom->saveXML();
And here's the result:
<?xml version="1.0"?>
<library>
<book id="1002">
<title>American dream</title>
<author>Hello</author>
<price>300</price>
</book>
<book id="1003">
<title>Jquery MVC</title>
<author>Me</author>
<price>500</price>
</book>
<book id="1001">
<title>Php</title>
<author>Me</author>
<price>600</price>
</book>
<book id="1002">
<title>Where to use IFrame</title>
<author>Me</author>
<price>300</price>
</book>
</library>
This way you can sort the parts of the document before comparing. After that you can even use the DOM comparison directly. Even you could reorder the nodes, it were a similar approach.
I'm not sure it'll be very useful in the case if you have a variable node number. Say if the <NDC> tag were repeated some random number of times and it's values were completely different.
And after all, I still think the simplest way were to ask your supplicant to create some more predictable document structure :)
Thanks
Anatol
Upvotes: 1
Reputation: 690
What you see is the diff in the libdiffmark format. Right from that page:
<copy/> is used in places where the input subtrees are the same
The documents from your snippet have partially identical sub trees. Effectively the instructions libdiffmark will execute are
The order of the nodes matters. Please think about how a diff would look like, if the node order were ignored. Say you had 42 nodes and some of those were the same, how it would apply the copy instruction with the count? Much easier for a diff to use the exact node order of two documents. One interesting reading I've found here about why node order can be important.
Thanks.
Upvotes: 2