brianmearns
brianmearns

Reputation: 9967

XML parser that contains debug information

I'm looking for an XML parser for python that includes some debug information in each node, for instance the line number and column where the node began. Ideally, it would be a parser that is compatible with xml.etree.ElementTree.XMLParser, i.e., one that I can pass to xml.etree.ElementTree.parse.

I know these parsers don't actually produce the elements, so I'm not sure how this would really work, but it seems like such a useful thing, I'll be surprised if no-body has one. Syntax errors in the XML are one thing, but semantic errors in the resulting structure can be difficult to debug if you can't point to a specific location in the source file/string.

Upvotes: 0

Views: 699

Answers (1)

Jan Vlcinsky
Jan Vlcinsky

Reputation: 44102

Point to an element by xpath (lxml - getpath)

lxml offers finding an xpath for an element in document.

Having test document:

>>> from lxml import etree
>>> xmlstr = """<root><rec id="01"><subrec>a</subrec><subrec>b</subrec></rec>
... <rec id="02"><para>graph</para></rec>
... </root>"""
...
>>> doc = etree.fromstring(xmlstr)
>>> doc
<Element root at 0x7f61040fd5f0>

We pick an element <para>graph</para>:

>>> para = doc.xpath("//para")[0]
>>> para
<Element para at 0x7f61040fd488>

XPath has a meaning, if we have clear context, in this case it is root of the XML document:

>>> root = doc.getroottree()
>>> root
<lxml.etree._ElementTree at 0x7f610410f758>

And now we can ask, what xpath leads from the root to the element of our interest:

>>> root.getpath(para)
'/root/rec[2]/para'

Upvotes: 1

Related Questions