user2910022

Reputation: 53

Reading same-named XML sub-elements in Python

I am new to both XML parsing and Python. I need to walk the tree's sub-elements and print all of them.

My XML file is available here: https://gofile.io/?c=OXcdue

My requirement is to read all the queues that have sub-queues, together with those sub-queues.

Upvotes: 0

Views: 1731

Answers (2)

balderman

Reputation: 23815

See below (this uses only the standard library, no external packages):

import xml.etree.ElementTree as ET

xml = '''<allocations>
    <queue name="bdpaas_express_q1">
      <minResources>12000 mb,2 vcores,1 disks</minResources>
      <maxResources>18000 mb,3 vcores,2 disks</maxResources>
      <aclSubmitApps> xyz</aclSubmitApps>
      <aclAdministerApps> xyz</aclAdministerApps>
      <label>allnodes</label>
    </queue>
    <queue name="dl_priority_q1">
      <minResources>8496000 mb,1416 vcores,108 disks</minResources>
      <maxResources>12768000 mb,2128 vcores,162 disks</maxResources>
      <aclSubmitApps> dla_grp</aclSubmitApps>
      <aclAdministerApps> dla_grp</aclAdministerApps>
      <label>fastnodes</label>
    </queue>
    <queue name="pireporting_q1">
      <minResources>6960000 mb,1160 vcores,87 disks</minResources>
      <maxResources>10440000 mb,1740 vcores,130 disks</maxResources>
      <queue name="atscale_rtam_mr_sq1">
        <minResources>6000000 mb,1000 vcores,75 disks</minResources>
        <maxResources>9000000 mb,1500 vcores,112 disks</maxResources>
        <aclSubmitApps> atscalep</aclSubmitApps>
        <aclAdministerApps> atscalep</aclAdministerApps>
        <label>allnodes</label>
      </queue>
      <queue name="atscale_spark_sq1">
        <minResources>960000 mb,160 vcores,12 disks</minResources>
        <maxResources>1440000 mb,240 vcores,18 disks</maxResources>
        <aclSubmitApps> atscalep</aclSubmitApps>
        <aclAdministerApps> atscalep</aclAdministerApps>
        <label>allnodes</label>
      </queue>
    </queue>
  <queuePlacementPolicy>
    <rule create="false" name="specified" />
    <rule name="reject" />
  </queuePlacementPolicy>
</allocations>
'''


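root = ET.fromstring(xml)
# './/queue' matches <queue> elements at any depth below the root
queues = root.findall('.//queue')
for queue in queues:
  # find() returns None when there is no direct <queue> child; compare
  # explicitly, because the truth value of an Element depends on its children
  if queue.find('./queue') is not None:
    # tostring() returns bytes when an encoding is given, so decode before printing
    print(ET.tostring(queue, encoding='utf8', method='xml').decode('utf8'))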

Output (abridged, with the contents of the sub-queues omitted):

<?xml version="1.0" encoding="UTF-8"?>
<queue name="pireporting_q1">
   <minResources>6960000 mb,1160 vcores,87 disks</minResources>
   <maxResources>10440000 mb,1740 vcores,130 disks</maxResources>
   <queue name="atscale_rtam_mr_sq1" />
   <queue name="atscale_spark_sq1" />
</queue>
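
If only the names are needed rather than the serialized XML, the same idea can be sketched as a small helper. This is a minimal illustration, not part of the original answer: the function name queues_with_subqueues and the trimmed-down SAMPLE document are hypothetical, and it assumes every queue carries a name attribute as in the XML above.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down sample modelled on the XML in this answer.
SAMPLE = """<allocations>
  <queue name="dl_priority_q1">
    <label>fastnodes</label>
  </queue>
  <queue name="pireporting_q1">
    <queue name="atscale_rtam_mr_sq1"><label>allnodes</label></queue>
    <queue name="atscale_spark_sq1"><label>allnodes</label></queue>
  </queue>
</allocations>"""


def queues_with_subqueues(root):
    """Map each queue that has direct sub-queues to the names of those sub-queues."""
    result = {}
    # iter('queue') visits <queue> elements at every depth
    for queue in root.iter('queue'):
        # findall('queue') returns direct children only, as a plain list
        children = queue.findall('queue')
        if children:
            result[queue.get('name')] = [c.get('name') for c in children]
    return result


parents = queues_with_subqueues(ET.fromstring(SAMPLE))
for name, subs in parents.items():
    print(name, '->', ', '.join(subs))
```

Because the loop visits queues at every depth, a sub-queue that has sub-queues of its own would get its own entry in the result as well.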

Upvotes: 0

equatorial_daydreamer

Reputation: 408

You can use the lxml library to parse XML content. One advantage over the standard xml library is easier access to a document's namespaces (for example through the nsmap attribute), although that is not needed in your case.

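from lxml import etree

# path_to_xml_file is a placeholder for the location of your XML file
tree = etree.parse(path_to_xml_file)
root = tree.getroot()

# Iterate over the element directly; getchildren() is deprecated
for children in root:
    print(children.tag)

    for child in children:
        print(child.tag, child.text)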

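Refer to the documentation of the standard library's xml.etree.ElementTree module for more information on how to access the various parts of your XML file and how to find all sub-elements recursively. That documentation is written for the standard xml library, but it largely applies to lxml too, because lxml.etree implements a compatible ElementTree API (lxml itself is built on the libxml2 C library, not on the standard xml package).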
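
As a rough sketch of what recursively finding all sub-elements can look like, the generator below walks the tree depth-first and indents each element by its nesting level. It is written against the standard xml.etree.ElementTree module so that it runs without extra packages; the helper name walk and the SAMPLE document are made up for illustration, and the same code should work with lxml.etree since it only relies on the shared element API (tag, text, get and iteration).

```python
import xml.etree.ElementTree as ET

# Hypothetical sample modelled on the question's XML.
SAMPLE = """<allocations>
  <queue name="pireporting_q1">
    <minResources>6960000 mb,1160 vcores,87 disks</minResources>
    <queue name="atscale_spark_sq1">
      <label>allnodes</label>
    </queue>
  </queue>
</allocations>"""


def walk(element, depth=0):
    """Yield one indented line per element, visiting the tree depth-first."""
    name = element.get('name')
    text = (element.text or '').strip()
    label = element.tag
    if name:
        label += f' name={name!r}'
    if text:
        label += f': {text}'
    yield '  ' * depth + label
    # Iterating over an element yields its direct children
    for child in element:
        yield from walk(child, depth + 1)


lines = list(walk(ET.fromstring(SAMPLE)))
print('\n'.join(lines))
```

Nested queues simply show up one indentation level deeper than their parent, so no special handling is needed for elements that share the same tag name.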

Upvotes: 1
