Rob

Reputation: 3459

Search entire tree with etree

I am using xml.etree.ElementTree as ET. It seems like the go-to library, but if there is something better for the job I'm interested.

Let's say I have a tree like:

doc = """
<top>
<second>
<third>
    <subthird></subthird>
    <subthird2>
         <subsubthird>findme</subsubthird>
    </subthird2>
</third>
</second>
</top>"""

and for the sake of this problem, let's say this is already in an ElementTree named myTree.

I want to update findme to found. Is there a simple way to do it other than walking down the tree like this:

myTree.getroot().getchildren()[0].getchildren()[0].getchildren() \
    [1].getchildren()[0].text = 'found'

The issue is that I have a large XML tree and I want to update these values, and I can't find a clear and Pythonic way to do this.

Upvotes: 0

Views: 1190

Answers (2)

tdelaney

Reputation: 77347

I use lxml with XPath expressions. ElementTree supports an abbreviated XPath syntax, but since I don't use it, I don't know how extensive it is. The advantage of XPath is that you can write as complex an element selector as you need. In this example, it's based on nesting:

import lxml.etree

doc = """
<top>
<second>
<third>
    <subthird></subthird>
    <subthird2>
         <subsubthird>findme</subsubthird>
    </subthird2>
</third>
</second>
</top>"""

root = lxml.etree.XML(doc)
for elem in root.xpath('second/third/subthird2/subsubthird'):
    elem.text = 'found'

print(lxml.etree.tostring(root, pretty_print=True, encoding='unicode'))

But suppose there was something else identifying the element, such as a unique attribute,

<subthird2 class="foo"><subsubthird>findme</subsubthird></subthird2>

then your XPath would be //subthird2[@class="foo"]/subsubthird.
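
As a rough sketch of the attribute case, the same kind of predicate also works with the standard library's limited XPath subset, so lxml is not strictly required for it. This assumes the class="foo" attribute from the snippet above has been added to the sample document. Note that ElementTree wants a relative path (.//) rather than a leading // when searching from an element.

```python
import xml.etree.ElementTree as ET

# Sample document from the question, with the assumed class="foo" attribute added.
doc = """
<top>
<second>
<third>
    <subthird></subthird>
    <subthird2 class="foo">
         <subsubthird>findme</subsubthird>
    </subthird2>
</third>
</second>
</top>"""

root = ET.fromstring(doc)

# ElementTree supports the [@attrib='value'] predicate; the path must be
# relative (.//) when findall() is called on an element.
for elem in root.findall(".//subthird2[@class='foo']/subsubthird"):
    elem.text = 'found'

print(ET.tostring(root, encoding='unicode'))
```

For selectors beyond this subset (functions, axes, and so on), lxml's full XPath support is the safer choice.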

Upvotes: 0

Aaron Perley

Reputation: 629

You can use XPath expressions to find every element with a specific tag name like this:

for el in myTree.getroot().findall(".//subsubthird"):
    el.text = 'found'

If you need to find all tags with a specific text value, take a look at this answer: Find element by text with XPath in ElementTree.
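
For the text-matching case, one simple option that avoids XPath altogether is to walk the tree with iter() and compare each element's text. This is a minimal sketch, not necessarily what the linked answer does. The "keep" value is a hypothetical addition, included only to show that other elements are left alone.

```python
import xml.etree.ElementTree as ET

# Shortened version of the question's document; "keep" is a made-up value.
doc = """<top><second><third>
    <subthird>keep</subthird>
    <subthird2><subsubthird>findme</subsubthird></subthird2>
</third></second></top>"""

myTree = ET.ElementTree(ET.fromstring(doc))

# iter() visits every element in document order, so no path is needed.
for el in myTree.getroot().iter():
    if el.text == 'findme':
        el.text = 'found'

print(ET.tostring(myTree.getroot(), encoding='unicode'))
```

This compares the text exactly, so elements whose text has surrounding whitespace would need a strip() first.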

Upvotes: 1
