phappy

Reputation: 75

Python parse and modify XML elements and subelements

I'm using ElementTree to parse and modify my XML file, which has the structure below. The actual file is much bigger (Platz_1 to Platz_250), but the structure is the same. When the text of the "_Name" element of Platz_X is None, I want to set the text of all elements and subelements of that Platz_X to "0" at once, and then continue with the next one, Platz_X+1.

My problem is that when I loop through the file to check all the values, I don't know how to stop the loop, set all the texts to "0", and continue with the next Platz_X+1.


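import xml.etree.ElementTree as ET

tree = ET.parse(xml)  # xml: path to the XML file
root = tree.getroot()
wkz = list(root)  # getchildren() was removed in Python 3.9

for sub_wkz in wkz:
    for platz in sub_wkz:
        for child in platz:
            # direct children of Platz_X: _Name, _Duplo, _Zustand, Schneide_N
            if child.text and child.text.strip():
                var = child.text

            for subchild in child:
                # children of Schneide_N
                if subchild.text and subchild.text.strip():
                    var_sub = subchild.text

<?xml version='1.0' encoding='utf-8'?>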
<Maschine>
  <INDUSTRIE_WKZ_1>
    <Platz_1>
      <_Name>6006003</_Name>
      <_Duplo>1</_Duplo>
      <_Zustand>131</_Zustand>
      <Schneide_1>
        <_Sollstandzeit>60,0</_Sollstandzeit>
        <_Iststandzeit>50,58213424682617</_Iststandzeit>
        <_Vorwarngrenze>10,0</_Vorwarngrenze>
        <_Laenge_L1>237,89599609375</_Laenge_L1>
        <_Laenge_L2>0</_Laenge_L2>
        <_Radius>0</_Radius>
      </Schneide_1>
      <Schneide_2>
        <_Sollstandzeit>0</_Sollstandzeit>
        <_Iststandzeit>0</_Iststandzeit>
        <_Vorwarngrenze>0</_Vorwarngrenze>
        <_Laenge_L1>0</_Laenge_L1>
        <_Laenge_L2>0</_Laenge_L2>
        <_Radius>0</_Radius>
      </Schneide_2>
      <Schneide_3>
        <_Sollstandzeit>0</_Sollstandzeit>
        <_Iststandzeit>0</_Iststandzeit>
        <_Vorwarngrenze>0</_Vorwarngrenze>
        <_Laenge_L1>0</_Laenge_L1>
        <_Laenge_L2>0</_Laenge_L2>
        <_Radius>0</_Radius>
      </Schneide_3>
      <Schneide_4>
        <_Sollstandzeit>0</_Sollstandzeit>
        <_Iststandzeit>0</_Iststandzeit>
        <_Vorwarngrenze>0</_Vorwarngrenze>
        <_Laenge_L1>0</_Laenge_L1>
        <_Laenge_L2>0</_Laenge_L2>
        <_Radius>0</_Radius>
      </Schneide_4>
    </Platz_1>
  <INDUSTRIE_WKZ_1>
<Maschine>

Upvotes: 0

Views: 126

Answers (1)

Lenormju

Reputation: 4378

I changed the XML you provided a bit:

  • added the missing slash (/) to the INDUSTRIE_WKZ_1 closing tag
  • added the missing slash (/) to the Maschine closing tag
  • removed Schneide_2 through Schneide_4 for brevity (the code works fine with them too)
  • added a Platz_2 whose _Name is empty (if that is what you mean by "is None"), inside a new INDUSTRIE_WKZ_2 (to show that the code works when there are multiple "WKZ" elements)
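To show why the two slashes matter, here is a minimal check. It uses a trimmed-down stand-in for the question's XML (the `broken` and `fixed` strings and the `is_well_formed` helper are my own illustration, not part of the original file): without the closing tags, ElementTree refuses to parse the document at all.

```python
import xml.etree.ElementTree as ET

# Trimmed-down stand-ins for the question's XML (illustrative only).
broken = "<Maschine><INDUSTRIE_WKZ_1><Platz_1/><INDUSTRIE_WKZ_1><Maschine>"
fixed = "<Maschine><INDUSTRIE_WKZ_1><Platz_1/></INDUSTRIE_WKZ_1></Maschine>"


def is_well_formed(xml_text):
    """Return True if ElementTree can parse the text, False otherwise."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


print(is_well_formed(broken))  # False: the opened tags are never closed
print(is_well_formed(fixed))   # True
```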

This is the input file I used:

<?xml version='1.0' encoding='utf-8'?>
<Maschine>
  <INDUSTRIE_WKZ_1>
    <Platz_1>
      <_Name>6006003</_Name>
      <_Duplo>1</_Duplo>
      <_Zustand>131</_Zustand>
      <Schneide_1>
        <_Sollstandzeit>60,0</_Sollstandzeit>
        <_Iststandzeit>50,58213424682617</_Iststandzeit>
        <_Vorwarngrenze>10,0</_Vorwarngrenze>
        <_Laenge_L1>237,89599609375</_Laenge_L1>
        <_Laenge_L2>0</_Laenge_L2>
        <_Radius>0</_Radius>
      </Schneide_1>
    </Platz_1>
  </INDUSTRIE_WKZ_1>
  <INDUSTRIE_WKZ_2>
    <Platz_2>
      <_Name></_Name>
      <_Duplo>1</_Duplo>
      <_Zustand>131</_Zustand>
      <Schneide_1>
        <_Sollstandzeit>60,0</_Sollstandzeit>
        <_Iststandzeit>50,58213424682617</_Iststandzeit>
        <_Vorwarngrenze>10,0</_Vorwarngrenze>
        <_Laenge_L1>237,89599609375</_Laenge_L1>
        <_Laenge_L2>0</_Laenge_L2>
        <_Radius>0</_Radius>
      </Schneide_1>
    </Platz_2>
  </INDUSTRIE_WKZ_2>
</Maschine>

I assume there is only one Maschine element and that it contains only INDUSTRIE_WKZ_* elements, which in turn contain only Platz_* elements.

And here is my code:
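Under that assumption, the ElementTree path "*/*" selects exactly the grandchildren of the root, that is, every Platz_* element regardless of which INDUSTRIE_WKZ_* it sits in. A small sketch on a shortened, made-up sample (the `sample` string is only for illustration):

```python
import xml.etree.ElementTree as ET

# Shortened sample with the same nesting as the real file (illustrative only).
sample = """<Maschine>
  <INDUSTRIE_WKZ_1><Platz_1><_Name>6006003</_Name></Platz_1></INDUSTRIE_WKZ_1>
  <INDUSTRIE_WKZ_2><Platz_2><_Name></_Name></Platz_2></INDUSTRIE_WKZ_2>
</Maschine>"""

root = ET.fromstring(sample)

# "*" matches any direct child, so "*/*" matches any grandchild of the root.
platz_tags = [elem.tag for elem in root.findall("*/*")]
print(platz_tags)  # ['Platz_1', 'Platz_2']
```

If the real file has other kinds of elements at those two levels, a stricter filter (for example checking that the tag starts with "Platz_") would be safer.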

from itertools import islice
from xml.etree.ElementTree import ElementTree

src_xmlfile_name = "68253543.xml"
dst_xmlfile_name = "68253543_post.xml"

tree = ElementTree()
root = tree.parse(src_xmlfile_name)  # ElementTree.parse() returns the root element
for platz_elem in root.findall("*/*"):  # all "Platz" children of "WKZ" children of the root
    platz_name_elem = platz_elem.find("_Name")
    if platz_name_elem is not None and platz_name_elem.text is None:
        # we want to set to "0" all values in this Platz's descendants
        for platz_descendant in islice(platz_elem.iter(), 1, None):  # skip the first one, which is the "Platz" elem itself
            if (platz_descendant.tag != "_Name"  # leave "_Name" untouched
                    and platz_descendant.text is not None  # leave empty elements untouched
                    and platz_descendant.text.strip() != ""):  # leave whitespace-only text (indentation) untouched
                platz_descendant.text = "0"
tree.write(dst_xmlfile_name, encoding="utf-8", xml_declaration=True)

which produces this output:

<?xml version='1.0' encoding='utf-8'?>
<Maschine>
  <INDUSTRIE_WKZ_1>
    <Platz_1>
      <_Name>6006003</_Name>
      <_Duplo>1</_Duplo>
      <_Zustand>131</_Zustand>
      <Schneide_1>
        <_Sollstandzeit>60,0</_Sollstandzeit>
        <_Iststandzeit>50,58213424682617</_Iststandzeit>
        <_Vorwarngrenze>10,0</_Vorwarngrenze>
        <_Laenge_L1>237,89599609375</_Laenge_L1>
        <_Laenge_L2>0</_Laenge_L2>
        <_Radius>0</_Radius>
      </Schneide_1>
    </Platz_1>
  </INDUSTRIE_WKZ_1>
  <INDUSTRIE_WKZ_2>
    <Platz_2>
      <_Name />
      <_Duplo>0</_Duplo>
      <_Zustand>0</_Zustand>
      <Schneide_1>
        <_Sollstandzeit>0</_Sollstandzeit>
        <_Iststandzeit>0</_Iststandzeit>
        <_Vorwarngrenze>0</_Vorwarngrenze>
        <_Laenge_L1>0</_Laenge_L1>
        <_Laenge_L2>0</_Laenge_L2>
        <_Radius>0</_Radius>
      </Schneide_1>
    </Platz_2>
  </INDUSTRIE_WKZ_2>
</Maschine>

(including the XML declaration in the output file is based on this answer)
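If you would rather have the "skip this Platz and go on to the next" logic spelled out, here is a sketch of the same idea wrapped in a function and using `continue`. The function name `zero_unnamed_plaetze` and the sample string are my own, and I am assuming that a missing, empty, or whitespace-only _Name should all count as "is None"; adjust that test if your file means something else.

```python
import xml.etree.ElementTree as ET


def zero_unnamed_plaetze(root):
    """Set every non-blank text under an unnamed Platz to "0".

    Returns the number of Platz elements that were reset.
    """
    reset = 0
    for platz in root.findall("*/*"):
        name = platz.find("_Name")
        # Assumption: missing, empty and whitespace-only _Name are treated alike.
        if name is not None and name.text and name.text.strip():
            continue  # named Platz: leave it untouched, go on to the next one
        for elem in platz.iter():
            if elem is platz or elem.tag == "_Name":
                continue  # skip the Platz itself and its _Name
            if elem.text and elem.text.strip():
                elem.text = "0"  # only overwrite elements that hold a value
        reset += 1
    return reset


# Shortened, made-up sample with one named and one unnamed Platz.
sample = """<Maschine>
  <INDUSTRIE_WKZ_1>
    <Platz_1>
      <_Name>6006003</_Name>
      <_Duplo>1</_Duplo>
      <Schneide_1><_Radius>2,5</_Radius></Schneide_1>
    </Platz_1>
    <Platz_2>
      <_Name></_Name>
      <_Duplo>1</_Duplo>
      <Schneide_1><_Radius>2,5</_Radius></Schneide_1>
    </Platz_2>
  </INDUSTRIE_WKZ_1>
</Maschine>"""

root = ET.fromstring(sample)
print(zero_unnamed_plaetze(root))                            # 1
print(root.findtext("*/Platz_1/_Duplo"))                     # 1 (unchanged)
print(root.findtext("*/Platz_2/Schneide_1/_Radius"))         # 0
```

Working on a parsed root rather than on file names makes the function easy to try on a small string first, before running it against the full 250-Platz file.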

Upvotes: 1
