Kevin Pasquarella

Reputation: 89

Python XML ElementTree Removing All Elements

I'm writing a script that should remove a parent element from an XML file when its child element matches an entry in a CSV file. The loops and if statements work correctly, but as soon as I add the remove call, it deletes everything from the table whether or not it matches. I can't figure out why it does this.

cs = open('skus.csv', 'rb')
reader = csv.reader(cs)


tree = et.parse('christmas-dog-price.xml')
root = tree.getroot()
xmlns = {'pricebook': '{http://www.demandware.com/xml/impex/pricebook/2006-10-31}'}
price_table = root.find('.//{pricebook}price-table'.format(**xmlns))
product_id = [price_table.get('product-id') for price_table in root]
for sku in reader:
    for product in product_id:
        for price_table in root:
            if sku[0] != product:
                continue
            if sku[0] == product:
                root.remove(price_table)
            tree.write('please-work.xml')

Upvotes: 0

Views: 704

Answers (1)

Daniel

Reputation: 42778

In your code, you collect all product ids from the XML and compare them with each id in your CSV file. The innermost loop over root never uses price_table in the comparison, so if any id matches, you remove every element from root.

Your code is equivalent to this:

for sku in reader:
    for product in product_id:
        if sku[0] == product:
            for price_table in root:
                root.remove(price_table)
tree.write('please-work.xml')

which is equivalent to this:

if any(sku[0] in product_id for sku in reader):
    for price_table in root:
        root.remove(price_table)
tree.write('please-work.xml')

You should compare only the product-id of the current element with the ids from the CSV file:

import csv
import xml.etree.ElementTree as et

# Python 3: open the CSV in text mode (on Python 2, use mode 'rb' instead)
with open('skus.csv', newline='') as cs:
    reader = csv.reader(cs)
    # a set gives fast membership tests; skip blank rows
    product_ids = {sku[0] for sku in reader if sku}

tree = et.parse('christmas-dog-price.xml')
root = tree.getroot()
# Assumes, as in the question's loop, that the elements carrying
# product-id are direct children of root.
# Build the list first so nothing is removed while iterating over root.
to_be_removed = [element for element in root if element.get('product-id') in product_ids]
for element in to_be_removed:
    root.remove(element)
tree.write('please-work.xml')
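
If the price-table elements are nested deeper than the root (the question does not show the XML, so this is an assumption), root.remove() raises a ValueError because the element is not a direct child. A minimal sketch that works at any depth is below. The helper name remove_price_tables and the sample document are made up for illustration; only the namespace URI comes from the question.

```python
import xml.etree.ElementTree as et

# Namespace URI taken from the question; the sample document below is made up.
NS = '{http://www.demandware.com/xml/impex/pricebook/2006-10-31}'


def remove_price_tables(root, skus):
    """Remove every price-table whose product-id is in skus; return the count."""
    skus = set(skus)  # set membership is a constant-time lookup
    removed = 0
    # ElementTree elements do not know their parent, so visit each
    # potential parent and filter its direct children.
    for parent in list(root.iter()):
        # Copy the child list first: removing while iterating skips elements.
        for child in list(parent):
            if child.tag == NS + 'price-table' and child.get('product-id') in skus:
                parent.remove(child)
                removed += 1
    return removed


sample = (
    '<pricebooks xmlns="http://www.demandware.com/xml/impex/pricebook/2006-10-31">'
    '<price-tables>'
    '<price-table product-id="A1"/>'
    '<price-table product-id="B2"/>'
    '<price-table product-id="C3"/>'
    '</price-tables>'
    '</pricebooks>'
)
root = et.fromstring(sample)
count = remove_price_tables(root, ['A1', 'C3', 'Z9'])
remaining = [pt.get('product-id') for pt in root.iter(NS + 'price-table')]
print(count, remaining)  # 2 ['B2']
```

With a real file you would call the helper on tree.getroot() and then tree.write(...) as in the snippet above.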

Upvotes: 1
