stimko68

Reputation: 67

Removing parent element and all subelements from XML

Given an XML file with the following structure:

<Root>
    <Stuff></Stuff>
    <MoreStuff></MoreStuff>
    <Targets>
        <Target>
            <ID>12345</ID>
            <Type>Ground</Type>
            <Size>Large</Size>
        </Target>
        <Target>
            ...
        </Target>
    </Targets>
</Root>

I'm trying to loop through each child under the <Targets> element, check each <ID> for a specific value, and if the value is found, then I want to delete the entire <Target> entry. I've been using the ElementTree Python library with little success. Here's what I have so far:

import xml.etree.ElementTree as ET

tree = ET.parse('file.xml')
root = tree.getroot()

iterator = root.iter('Target')  # getiterator() was removed in Python 3.9

for item in iterator:
    old = item.find('ID')
    text = old.text
    if '12345' in text:
        item.remove(old)

tree.write('out.xml')

The problem with this approach is that only the <ID> subelement is removed, but I need the entire <Target> element and all of its child elements removed. Can anyone help? Thanks.

Upvotes: 2

Views: 4599

Answers (2)

tdelaney

Reputation: 77407

An element's remove() method only removes its direct children, so you need a reference to the Targets element in order to remove a Target from it. Start your iteration from there: grab each Target, check your condition, and remove the ones that match.

#!/usr/bin/env python
import xml.etree.ElementTree as ET

xmlstr = """<Root>
    <Stuff></Stuff>
    <MoreStuff></MoreStuff>
    <Targets>
        <Target>
            <ID>12345</ID>
            <Type>Ground</Type>
            <Size>Large</Size>
        </Target>
        <Target>
            ...
        </Target>
    </Targets>
</Root>"""

root = ET.fromstring(xmlstr)

targets = root.find('Targets')

for target in targets.findall('Target'):
    _id = target.find('ID')
    if _id is not None and '12345' in _id.text:
        targets.remove(target)

print(ET.tostring(root, encoding='unicode'))
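The same idea can be wrapped in a small helper. This is only a sketch: the function name `remove_targets`, the second target's ID `67890`, and its type `Air` are made up for illustration, and it uses an exact match on the ID text rather than the substring test above.

```python
import xml.etree.ElementTree as ET


def remove_targets(root, wanted_id):
    """Remove each <Target> under <Targets> whose <ID> text equals wanted_id.

    Hypothetical helper; returns the number of elements removed.
    """
    targets = root.find('Targets')
    if targets is None:
        return 0
    removed = 0
    # findall() returns a list, so removing while looping is safe.
    for target in targets.findall('Target'):
        # findtext() returns None when there is no <ID> child.
        if target.findtext('ID') == wanted_id:
            targets.remove(target)
            removed += 1
    return removed


# Illustrative data only; the second target is an assumption.
sample = """<Root>
    <Targets>
        <Target><ID>12345</ID><Type>Ground</Type></Target>
        <Target><ID>67890</ID><Type>Air</Type></Target>
    </Targets>
</Root>"""

root = ET.fromstring(sample)
count = remove_targets(root, '12345')
remaining = [t.findtext('ID') for t in root.iter('Target')]
```

To work on a file instead, parse it with ET.parse('file.xml'), pass tree.getroot() to the helper, and finish with tree.write('out.xml').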

Upvotes: 4

mgilson

Reputation: 310287

Unfortunately, ElementTree elements don't know who their parents are. As a workaround, you can build the child-to-parent mapping yourself:

import xml.etree.ElementTree as ET

tree = ET.parse('file.xml')
root = tree.getroot()
# Map each child element to its parent.
parent_map = {c: p for p in tree.iter() for c in p}

# list so that we don't mess up the order of iteration when removing items.
iterator = list(root.iter('Target'))

for item in iterator:
    old = item.find('ID')
    # Skip targets that have no <ID> or an empty one.
    if old is not None and old.text and '12345' in old.text:
        parent_map[item].remove(item)

tree.write('out.xml')

Untested
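If you would rather not build a map, another option is to visit every element as a potential parent and remove matching children directly. This is a sketch of a different technique from the one above, not a drop-in replacement: `remove_matching`, the `<Archive>` element, and the ID `67890` are invented for the example, and the ID comparison is an exact match.

```python
import xml.etree.ElementTree as ET


def remove_matching(root, tag, id_value):
    """Remove every `tag` element, at any depth, whose <ID> text equals id_value.

    Hypothetical helper; returns the number of elements removed.
    """
    removed = 0
    # Snapshot the elements first so removals don't disturb the traversal.
    for parent in list(root.iter()):
        # Copy the child list for the same reason.
        for child in list(parent):
            if child.tag == tag and child.findtext('ID') == id_value:
                parent.remove(child)
                removed += 1
    return removed


# Illustrative data only; <Archive> is an assumption to show nesting.
sample = """<Root>
    <Targets>
        <Target><ID>12345</ID></Target>
        <Target><ID>67890</ID></Target>
    </Targets>
    <Archive>
        <Target><ID>12345</ID></Target>
    </Archive>
</Root>"""

root = ET.fromstring(sample)
removed_count = remove_matching(root, 'Target', '12345')
left = [t.findtext('ID') for t in root.iter('Target')]
```

Because the parent is always in hand at the moment of removal, this works wherever the Target elements sit in the tree, at the cost of visiting every element once.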

Upvotes: 7
