Jonathan

Reputation: 175

Modify specific subelement in xml with Python

I am starting to play around with Python but have hit a wall with XML.

I am trying to edit an XML sub-element that is stored in a somewhat unconventional way: it holds a series of numbers, more like a vector, with every child element named 'double' even though the content is actually text.

This is an example of such a file:

<simulation>
         <element1>'A'</element1>
         <element2>
              <subelement1>
                   <double>1</double>
                   <double>2</double>
                   <double>3</double>
                   <double>4</double>
                   <double>5</double>
              </subelement1>
              <subelement2>
                   <double>1</double>
                   <double>2</double>
                   <double>3</double>
                   <double>4</double>
                   <double>5</double>
              </subelement2>
         </element2>
</simulation>

What I want to do is change the values of all child nodes of subelement1 to, say, 10, 20, 30, 40, 50, ending up with something like this:

<simulation>
         <element1>'A'</element1>
         <element2>
              <subelement1>
                   <double>10</double>
                   <double>20</double>
                   <double>30</double>
                   <double>40</double>
                   <double>50</double>
              </subelement1>
              <subelement2>
                   <double>1</double>
                   <double>2</double>
                   <double>3</double>
                   <double>4</double>
                   <double>5</double>
              </subelement2>
         </element2>
</simulation>

I can access all the nodes that I want to change with this:

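import xml.etree.ElementTree as ET

# 'simulation.xml' is a placeholder name for the file shown above
root = ET.parse('simulation.xml').getroot()

for elem in root:
    for subelem in elem.findall('.//subelement1/double'):
        print(subelem.attrib)
        print(subelem.text)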

This shows the numbers I want to change (see below), but I could not find a way to actually change them to the ones I need.

{} 1    {} 2    {} 3    {} 4    {} 5

If I try to use it as a vector or something like this:

for elem in root:
    for subelem in elem.findall('.//subelement1/double'):
        subelem.text = [10,20,30,40,50]
        print(subelem.text)

I end up not substituting the values but adding information, and the results are:

{} 1 [10,20,30,40,50]
{} 2 [10,20,30,40,50]
{} 3 [10,20,30,40,50]
{} 4 [10,20,30,40,50]
{} 5 [10,20,30,40,50]

What would be a way to make the changes? Thank you very much.

Upvotes: 1

Views: 1429

Answers (1)

tdelaney

Reputation: 77407

Assigning to an element's text attribute replaces the value; it doesn't append. There must have been something wrong in your test code.

You need to make sure you assign a string. ET will accept a number or a list (any object, really) at assignment time, but it will raise a TypeError later when you try to serialize the tree. Also, there is no need to iterate over the first level of elements before calling findall: the .// prefix tells it to search the entire subtree.
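Both points can be checked with a small sketch. The one-element document below is made up for illustration and is not the file from the question; it only shows that assignment overwrites the text and that a non-string value is rejected at serialization time rather than at assignment time.

```python
import xml.etree.ElementTree as ET

# Tiny stand-in document, invented for this demonstration.
root = ET.fromstring("<subelement1><double>1</double></subelement1>")
node = root.find("double")

# Assignment replaces the old text; nothing is appended.
node.text = "10"
replaced = ET.tostring(root, encoding="unicode")
print(replaced)

# A non-string value is accepted here without complaint...
node.text = [10, 20, 30, 40, 50]

# ...and only fails once the tree is serialized.
try:
    ET.tostring(root, encoding="unicode")
    failed = False
except TypeError as exc:
    failed = True
    print("serialization failed:", exc)
```

The first print shows `<double>10</double>` with no trace of the original `1`, which is why the "adding information" effect in the question most likely came from the print statements rather than from the assignment itself.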

import xml.etree.ElementTree as ET

xmltext = """<simulation>
         <element1>'A'</element1>
         <element2>
              <subelement1>
                   <double>1</double>
                   <double>2</double>
                   <double>3</double>
                   <double>4</double>
                   <double>5</double>
              </subelement1>
              <subelement2>
                   <double>1</double>
                   <double>2</double>
                   <double>3</double>
                   <double>4</double>
                   <double>5</double>
              </subelement2>
         </element2>
</simulation>"""

root = ET.fromstring(xmltext)

# to apply a function to each text node
#for subelem in root.findall('.//subelement1/double'):
#    subelem.text = str(int(subelem.text) * 10)

# to replace a known number of text nodes
new_doubles = [10, 20, 30, 40, 50]
# find() returns the first subelement1 that has a double child, or None
subelem = root.find('.//subelement1[double]')
if subelem is not None:
    for elem, dbl in zip(subelem.findall('double'), new_doubles):
        elem.text = str(dbl)

print(ET.tostring(root, encoding="unicode"))

Prints

<simulation>
         <element1>'A'</element1>
         <element2>
              <subelement1>
                   <double>10</double>
                   <double>20</double>
                   <double>30</double>
                   <double>40</double>
                   <double>50</double>
              </subelement1>
              <subelement2>
                   <double>1</double>
                   <double>2</double>
                   <double>3</double>
                   <double>4</double>
                   <double>5</double>
              </subelement2>
         </element2>
</simulation>
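If the number of new values might not match the number of double nodes, zip will silently stop at the shorter sequence. A minimal sketch of a stricter variant follows; `set_doubles` is a hypothetical helper name, not part of ElementTree, and the shortened document is only there to keep the example small.

```python
import xml.etree.ElementTree as ET


def set_doubles(root, parent_tag, values):
    """Overwrite the <double> children of the first <parent_tag> element.

    Hypothetical helper for illustration; not an ElementTree API.
    """
    parent = root.find(f".//{parent_tag}")
    if parent is None:
        raise ValueError(f"no <{parent_tag}> element found")
    doubles = parent.findall("double")
    if len(doubles) != len(values):
        # zip() alone would silently ignore the extra items.
        raise ValueError(f"expected {len(doubles)} values, got {len(values)}")
    for node, value in zip(doubles, values):
        node.text = str(value)  # text must be a string to serialize


# Shortened version of the document from the question.
root = ET.fromstring(
    "<simulation><element2>"
    "<subelement1><double>1</double><double>2</double></subelement1>"
    "<subelement2><double>1</double><double>2</double></subelement2>"
    "</element2></simulation>"
)
set_doubles(root, "subelement1", [10, 20])
result = [d.text for d in root.iter("double")]
print(result)
```

Only the children of subelement1 change; subelement2 keeps its original values. To persist the result, the tree can be written back out with `ET.ElementTree(root).write(...)` using a file name of your choice.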

Upvotes: 2
