Reputation: 3459
I am using xml.etree.ElementTree (imported as ET). It seems like the go-to library, but if there is something better suited to the job, I'd be interested to hear about it.
Let's say I have a tree like:
doc = """
<top>
<second>
<third>
<subthird></subthird>
<subthird2>
<subsubthird>findme</subsubthird>
</subthird2>
</third>
</second>
</top>"""
For the sake of this problem, let's say this is already parsed into an ElementTree named myTree. I want to update findme to found. Is there a simple way to do it other than walking down the tree like this:
myTree.getroot().getchildren()[0].getchildren()[0].getchildren() \
[1].getchildren()[0].text = 'found'
The issue is that I have a large XML tree in which I want to update values like this, and I can't find a clear, Pythonic way to do it.
Upvotes: 0
Views: 1190
Reputation: 77347
I use lxml with XPath expressions. ElementTree has an abbreviated XPath syntax, but since I don't use it, I don't know how extensive it is. The nice thing about XPath is that you can write as complex an element selector as you need. In this example, it's based on nesting:
import lxml.etree
doc = """
<top>
<second>
<third>
<subthird></subthird>
<subthird2>
<subsubthird>findme</subsubthird>
</subthird2>
</third>
</second>
</top>"""
root = lxml.etree.XML(doc)
for elem in root.xpath('second/third/subthird2/subsubthird'):
    elem.text = 'found'
print(lxml.etree.tostring(root, pretty_print=True, encoding='unicode'))
But suppose there was something else to identify the element, such as a unique attribute:
<subthird2 class="foo"><subsubthird>findme</subsubthird></subthird2>
Then your XPath would be //subthird2[@class="foo"]/subsubthird.
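For anyone who would rather stay with the standard library, here is a minimal sketch of the same attribute-based lookup using xml.etree.ElementTree instead of lxml. Its XPath subset does support [@attrib='value'] predicates, but the path has to be relative, so it starts with .// rather than //. The class="foo" attribute is the hypothetical one from the example above, not part of the original document.

```python
import xml.etree.ElementTree as ET

# Same document as in the question, with the hypothetical
# class="foo" attribute added to <subthird2>.
doc = """
<top>
  <second>
    <third>
      <subthird></subthird>
      <subthird2 class="foo">
        <subsubthird>findme</subsubthird>
      </subthird2>
    </third>
  </second>
</top>"""

root = ET.fromstring(doc)

# ElementTree paths are relative to the element being searched,
# so use ".//" where full XPath (as in lxml) would accept "//".
for elem in root.findall(".//subthird2[@class='foo']/subsubthird"):
    elem.text = 'found'

print(ET.tostring(root, encoding='unicode'))
```

Anything beyond simple tag, attribute, and position predicates (XPath functions, axes such as ancestor::) is where lxml's full XPath support becomes necessary.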
Upvotes: 0
Reputation: 629
You can use an XPath expression to select every element with a specific tag name, like this:
for el in myTree.getroot().findall(".//subsubthird"):
    el.text = 'found'
If you need to find all tags with a specific text value, take a look at this answer: Find element by text with XPath in ElementTree.
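As a rough sketch of the text-based case, one option that needs no XPath at all is to walk the tree with iter() and compare each element's text directly. The sample document below is made up for illustration, and the comparison is an exact match, so text with surrounding whitespace would need a strip() first.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: two elements contain "findme", one does not.
doc = "<top><a>findme</a><b><c>findme</c><d>keep</d></b></top>"
root = ET.fromstring(doc)

# iter() yields the root and every descendant in document order.
for el in root.iter():
    # el.text is None for elements with no text, which never equals a str.
    if el.text == 'findme':
        el.text = 'found'

result = ET.tostring(root, encoding='unicode')
print(result)
```

This visits every element once, so it scales with the size of the tree regardless of how deeply the matching elements are nested.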
Upvotes: 1