Łukasz

Reputation: 29

XML remove child based on the subelement value in Python

I'm trying to remove multiple 'Traded' children from the root, based on the value of each one's 'id' subelement. I have a list of ids and would like a loop that removes every 'Traded' element whose id is in that list.

The XML is as follows.

<holdings>
  <Traded>
    <positionName>position1</positionName>
    <id type="ISIN" exchange="BB">BE154</id>
    <amount>400</amount>
    <price currency="EUR">44.000000000000</price>
  </Traded>
  <Traded>
    <positionName>position2</positionName>
    <id type="ISIN" exchange="FP">FR200</id>
    <amount>200</amount>
    <price currency="EUR">58.240000000000</price>
  </Traded>
  <Traded>
    <positionName>position3</positionName>
    <id type="ISIN" exchange="UN">US400</id>
    <amount>100</amount>
    <price currency="USD">413.310000000000</price>
  </Traded>
  <Traded>
    <positionName>position4</positionName>
    <id type="ISIN" exchange="UN">US15</id>
    <amount>50000</amount>
    <price currency="USD">20.120000000000</price>
  </Traded>
</holdings>

Python code that doesn't work:

import xml.etree.ElementTree as ET

tree = ET.parse("positions.xml")
root = tree.getroot()

removelist = ["BE154", "FR200", "US400", "US15"]

for child in root:
    for subelement in root.iter("id"):
        for subelement.text in removelist:
            child.remove

Any help will be appreciated.

Upvotes: 0

Views: 49

Answers (1)

AKX

Reputation: 169032

This seems to do the trick. (I've embedded the XML for the sake of an encapsulated example.)

import xml.etree.ElementTree as ET

root = ET.fromstring(
    """
<holdings>
  <Traded>
    <positionName>position1</positionName>
    <id type="ISIN" exchange="BB">BE154</id>
    <amount>400</amount>
    <price currency="EUR">44.000000000000</price>
  </Traded>
  <Traded>
    <positionName>position2</positionName>
    <id type="ISIN" exchange="FP">FR200</id>
    <amount>200</amount>
    <price currency="EUR">58.240000000000</price>
  </Traded>
  <Traded>
    <positionName>position3</positionName>
    <id type="ISIN" exchange="UN">US400</id>
    <amount>100</amount>
    <price currency="USD">413.310000000000</price>
  </Traded>
  <Traded>
    <positionName>position4</positionName>
    <id type="ISIN" exchange="UN">US15</id>
    <amount>50000</amount>
    <price currency="USD">20.120000000000</price>
  </Traded>
</holdings>
"""
)

removelist = {"BE154", "FR200", "US400", "US15"}

for child in root[:]:  # `[:]` to take a copy of the list since we'd modify it
    id_element = child.find("id")
    if id_element is None:  # no ID, so keep this one
        continue
    trade_id = id_element.text
    if trade_id in removelist:
        root.remove(child)

print(ET.tostring(root, encoding="unicode"))

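Because the sample removelist contains every id in the document, running this as written removes all four Traded elements and leaves an empty holdings element. With a shorter list such as removelist = {"BE154", "US400"}, the output is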

<holdings>
  <Traded>
    <positionName>position2</positionName>
    <id type="ISIN" exchange="FP">FR200</id>
    <amount>200</amount>
    <price currency="EUR">58.240000000000</price>
  </Traded>
  <Traded>
    <positionName>position4</positionName>
    <id type="ISIN" exchange="UN">US15</id>
    <amount>50000</amount>
    <price currency="USD">20.120000000000</price>
  </Traded>
</holdings>
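
The `[:]` copy in the loop matters. As a minimal sketch (the tiny `<r>` document and the helper name `remove_all` are made up for illustration), removing children while iterating over the element itself tends to skip siblings, because the child list shifts under the iterator:

```python
import xml.etree.ElementTree as ET

XML = "<r><a/><a/><a/><a/></r>"  # hypothetical four-child document


def remove_all(root, copy_first):
    """Try to remove every child of root; return how many are left afterwards."""
    # root[:] is a plain list snapshot; root itself is the live child sequence.
    children = root[:] if copy_first else root
    for child in children:
        root.remove(child)
    return len(root)


left_without_copy = remove_all(ET.fromstring(XML), copy_first=False)
left_with_copy = remove_all(ET.fromstring(XML), copy_first=True)
print(left_without_copy, left_with_copy)
```

Without the copy some children survive even though the loop was meant to remove all of them; with the copy none are left.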

Upvotes: 2
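
To apply the same idea to a file on disk, as in the question, a rough sketch might look like the following. The function names `remove_traded_by_id` and `filter_file` are assumptions for illustration, not part of the answer above; `ET.parse`, `Element.findall`, `Element.find`, `Element.remove`, and `ElementTree.write` are standard `xml.etree.ElementTree` calls.

```python
import xml.etree.ElementTree as ET


def remove_traded_by_id(root, ids_to_remove):
    """Remove each direct <Traded> child whose <id> text is in ids_to_remove.

    Returns the number of removed elements.
    """
    removed = 0
    # findall() returns a new list, so removing from root inside the loop is safe.
    for traded in root.findall("Traded"):
        id_element = traded.find("id")
        if id_element is not None and id_element.text in ids_to_remove:
            root.remove(traded)
            removed += 1
    return removed


def filter_file(src_path, dst_path, ids_to_remove):
    """Parse src_path, drop the matching <Traded> elements, write to dst_path."""
    tree = ET.parse(src_path)
    removed = remove_traded_by_id(tree.getroot(), ids_to_remove)
    tree.write(dst_path, encoding="utf-8", xml_declaration=True)
    return removed
```

Calling `filter_file("positions.xml", "positions_filtered.xml", removelist)` would write the trimmed document to a new file (the output name here is hypothetical) and leave the input untouched.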
