Reputation: 6350
I have an in-memory Python XML ElementTree which looks like
<A>
  <B>..</B>
  <C>..</C>
  <D>..</D>
</A>
I serialize the ElementTree into XML with
xmlstr = minidom.parseString(ET.tostring(root)).toprettyxml(" ")
The order of the inner nodes B, C, D changes every time I invoke the tostring() call above. How can I make sure the serialization follows a deterministic order?
Upvotes: 3
Views: 2385
Reputation: 338406
I realize many answers here suggest this, but
minidom.parseString(ET.tostring(root)).toprettyxml(" ")
is actually a really horrible way of pretty-printing an XML file.
It involves parsing, serializing with ET, and then parsing and serializing all over again with a completely different XML library. It's silly and wasteful, and I would not be surprised if minidom messes it up.
If you have it installed, switch to lxml and use its built-in pretty-printing function.
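A minimal sketch of the lxml route, assuming the third-party lxml package is available. The guarded import and the sample document (borrowed from the question) are only there for illustration:

```python
# Assumes the third-party lxml package is installed (pip install lxml).
try:
    from lxml import etree
except ImportError:
    etree = None  # lxml is not available; use xml.etree.ElementTree instead

pretty = None
if etree is not None:
    root = etree.fromstring("<A><B>..</B><C>..</C><D>..</D></A>")
    # pretty_print=True lets lxml insert the line breaks and indentation itself
    pretty = etree.tostring(root, pretty_print=True, encoding="unicode")
    print(pretty)
```

With encoding="unicode" the result is a str rather than bytes, which is usually what you want for printing.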
If you are for some reason stuck with xml.etree.ElementTree, you can use a simple recursive function to prettify a tree in-place:
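# xmlhelpers.py
# taken from http://effbot.org/zone/element-lib.htm#prettyprint
def indent(elem, level=0):
    """Add indentation whitespace to elem and its descendants, in place."""
    i = "\n" + level * " "
    if len(elem):
        # element has children: indent the first child one level deeper
        if not elem.text or not elem.text.strip():
            elem.text = i + " "
        if not elem.tail or not elem.tail.strip():
            elem.tail = i
        for child in elem:
            indent(child, level + 1)
        # the last child is followed by the parent's closing tag, so dedent it
        if not child.tail or not child.tail.strip():
            child.tail = i
    else:
        # leaf element: only set the tail, and never for a childless root
        if level and (not elem.tail or not elem.tail.strip()):
            elem.tail = i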
Usage is straightforward:
import xml.etree.ElementTree as ET
from xmlhelpers import indent
root = ET.fromstring("<A><B>..</B><C>..</C><D>..</D></A>")
indent(root)
print(ET.tostring(root))
This prints a nicely indented version:
b'<A>\n <B>..</B>\n <C>..</C>\n <D>..</D>\n</A>\n'
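On Python 3.9 or newer, the standard library ships its own ET.indent(), so a hand-written helper may not be needed at all. A minimal sketch, assuming Python 3.9+ and the same sample document:

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("<A><B>..</B><C>..</C><D>..</D></A>")

# ET.indent() modifies the tree in place; it was added in Python 3.9
ET.indent(root, space=" ")

# encoding="unicode" returns a str instead of bytes
result = ET.tostring(root, encoding="unicode")
print(result)
```

The children are serialized in the order they are stored in the tree, so B, C, D come out in the same order on every run.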
That being said, never use tostring() to write an XML tree to a file.
Always write XML files with the write function the XML library provides, so that the library itself takes care of the output encoding.
tree = ET.ElementTree(root) # only necessary if you don't already have a tree
tree.write(filename, encoding="UTF-8")
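For completeness, here is a runnable sketch of the same write call. The temporary directory and the out.xml file name are placeholders of mine, and xml_declaration=True is an optional extra that requests an explicit XML header:

```python
import os
import tempfile
import xml.etree.ElementTree as ET

root = ET.fromstring("<A><B>..</B><C>..</C><D>..</D></A>")
tree = ET.ElementTree(root)

with tempfile.TemporaryDirectory() as tmpdir:
    # hypothetical output path, used only for this example
    filename = os.path.join(tmpdir, "out.xml")
    # let the library encode the document and write the declaration
    tree.write(filename, encoding="UTF-8", xml_declaration=True)
    # read the file back to check that the children kept their order
    reloaded = ET.parse(filename).getroot()
    child_tags = [child.tag for child in reloaded]

print(child_tags)
```

Reading the file back is just a sanity check here; in real code you would write to your actual target path and stop there.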
Upvotes: 2