Reputation: 77
I am new python. I have some predefined xml files. I have a script which generate new xml files. I want to write an automated script which compares xmls files and stores the name of differing xml file names in output file? Thanks in advance
Upvotes: 0
Views: 13963
Reputation: 1
What about the following snippet :
def separator(self):
return "!@#$%^&*" # Very ugly separator
def _traverseXML(self, xmlElem, tags, xpaths):
tags.append(xmlElem.tag)
for e in xmlElem:
self._traverseXML(e, tags, xpaths)
text = ''
if (xmlElem.text):
text = xmlElem.text.strip()
xpaths.add("/".join(tags) + self.separator() + text)
tags.pop()
def _xmlToSet(self, xml):
xpaths = set() # output
tags = list()
root = ET.fromstring(xml)
self._traverseXML(root, tags, xpaths)
return xpaths
def _areXMLsAlike(self, xml1, xml2):
xpaths1 = self._xmlToSet(xml1)
xpaths2 = self._xmlToSet(xml2)
return xpaths1 == xpaths2
Upvotes: 0
Reputation: 2656
Do you speak of comparing them byte-wise or for semantic equality?
(Is <tag attr1="1" attr2="2" />
equal to <tag attr2="2" attr1="1" />
?)
If you want to check for semantic equality have a look at Xml comparison in Python
When generating xml especially if using normal dicts for the attributes somewhere attribute order can be mixed up even sometimes when you use the same script with the same input.
items()
...
CPython implementation detail: Keys and values are listed in an arbitrary order which is non-random, varies across Python implementations, and depends on the dictionary’s history of insertions and deletions.
Upvotes: 2
Reputation: 14118
Building on @Xaranke's answer:
import filecmp
out_file = open("diff_xml_names.txt")
# Not sure what format your filenames will come in, but here's one possibility.
filePairs = [('f1a.xml', 'f1b.xml'), ('f2a.xml', 'f2b.xml'), ('f3a.xml', 'f3b.xml')]
for f1, f2 in filePairs:
if not filecmp.cmp(f1, f2):
# Files are not equal
out_file.write(f1+'\n')
out_file.close()
Upvotes: 0
Reputation:
I think you're looking for the filecmp
module. You can use it like this:
import filecmp
cmp = filecmp.cmp('f1.xml', 'f2.xml')
# Files are equal
if cmp:
continue
else:
out_file.write('f1.xml')
Replace f1.xml
and f2.xml
with your xml files.
Upvotes: 2