Python toprettyxml() formatting problems

Question

I'm trying to process XML using Python's minidom, and then output the result using toprettyxml(). I ran into two problems:

1. Blank lines are added.
2. Extra newlines and tabs are added around text nodes.

Here's the code and output:

$ cat test.py
from xml.dom import minidom

dom = minidom.parse("test.xml")
print(dom.toprettyxml())

$ cat test.xml



    
        orange
    



$ python test.py




    


        
            orange

I can work around problem 1 by using strip() to remove the blank lines, and problem 2 by using the fixed_writexml hack described at http://ronrothman.com/public/leftbraned/xml-dom-minidom-toprettyxml-and-silly-whitespace/. However, I was wondering whether there is a better solution, since the hack is almost 3 years old now. I'm open to using something other than minidom, but I'd like to avoid adding external packages such as lxml.
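
For reference, the strip() workaround for problem 1 looks roughly like the sketch below. The helper name pretty_without_blank_lines and the sample element names are placeholders made up for illustration, not taken from my real files. It only post-processes the output, so it does nothing about problem 2, and it would also drop blank lines that legitimately occur inside multi-line text content.

```python
from xml.dom import minidom

def pretty_without_blank_lines(dom):
    """Pretty-print a minidom node, dropping lines that hold only whitespace."""
    pretty = dom.toprettyxml()
    return "\n".join(line for line in pretty.splitlines() if line.strip())

# Hypothetical sample document; the element names are placeholders.
sample = "<a>\n    <b>\n        <c>orange</c>\n    </b>\n</a>"
print(pretty_without_blank_lines(minidom.parseString(sample)))
```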

CharlesB · Accepted Answer

One solution is to patch the minidom library with the patch proposed for the bug you mention.

I haven't tested it myself, and it is a bit hacky too, so it may not suit you!
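
If patching the library is not an option, a rough alternative is to remove the whitespace-only text nodes from the parsed tree before calling toprettyxml(), since those nodes are what end up as blank lines. This is only a sketch under some assumptions: strip_whitespace_nodes is a helper name I made up, the sample element names are placeholders, and it relies on a Python version whose minidom already keeps an element with a single text child on one line (recent Python 3 releases do). It also discards whitespace that might be significant in mixed content.

```python
from xml.dom import minidom

def strip_whitespace_nodes(node):
    """Recursively remove text nodes that contain only whitespace."""
    # Iterate over a copy because children are removed during the loop.
    for child in list(node.childNodes):
        if child.nodeType == child.TEXT_NODE and not child.data.strip():
            node.removeChild(child)
        else:
            strip_whitespace_nodes(child)

# Hypothetical sample document; the element names are placeholders.
sample = "<a>\n    <b>\n        <c>orange</c>\n    </b>\n</a>"

dom = minidom.parseString(sample)
strip_whitespace_nodes(dom)
print(dom.toprettyxml(indent="    "))
```

With the whitespace nodes gone, toprettyxml() only emits the indentation it generates itself, so no string post-processing is needed afterwards.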
