glodnykojot
glodnykojot

Reputation: 19

How to refer in Python to XML value between tags

<rule>
    <vars>
          <value>*</value>
          <value>MAP</value>
          <value></value>
          <value>*</value>
          <value>*</value>
          <value>8795</value>
          <value>4</value>
          <value>*</value>
    </vars>
</rule>

This is a fragment of my XML file. I have to refer to number between value tags. I want to find and delete code between rule tags. I try to do like this:

x = input("if find this value delete rule block  ")
str(x)
for child in root.iter():
    for rule in child.findall('rule'):
        for vars in rule.findall('vars'):
            val = str(vars.find('value'))
            print(val)
            if val == x:
            root.remove(rule)
tree.write('output.xml')

So the problem is here: val = str(vars.find('value')), because when I run this code and print val PowerShell prints:

Element 'value' at 0x0328BFC0

for all value tags.

Upvotes: 0

Views: 123

Answers (1)

larsks
larsks

Reputation: 311635

First, I think your outer loop (for child in root.iter()) is not what you want, because that will iterate over all elements in your document. This is going to cause you to visit some nodes multiple times.

Secondly, you're seeing...

Element 'value' at 0x0328BFC0 

...because you're calling str on the result of vars.find('value'), and the find method returns elements, not strings. If you want the text content of an element, use the .text attribute. For example:

if value.text == x:
    ...

Lastly, you can only use the remove method on the parent of the element you're trying to remove, so calling root.remove() is never going to work.

Putting all the above together, we get something like:

from lxml import etree

doc = etree.parse('data.xml')
root = doc.getroot()
target = input('remove items with this value: ')
for rule in root.findall('rule'):
    for vars in rule.findall('vars'):
        for value in vars.findall('value'):
            if value.text == target:
                value.getparent().remove(value)

doc.write('output.xml')

I had to make some assumptions about your input document, so I tested it against the following data:

<?xml version="1.0"?>
<document>
  <rule>
    <vars>
      <value>*</value>
      <value>MAP</value>
      <value></value>
      <value>*</value>
      <value>*</value>
      <value>8795</value>
      <value>4</value>
      <value>*</value>
    </vars>
  </rule>
</document>

Upvotes: 1

Related Questions