Jayabalan Bala

Reputation: 1217

How to create nested dictionary from XML using python?

I am reading an XML file, extracting part of it, and writing that part to a YAML file. For example, given this XML file:

<project>


  <scm class="hudson.scm.NullSCM"/>
  <assignedNode>python</assignedNode>
  <canRoam>false</canRoam>
  <disabled>false</disabled>
  <blockBuildWhenDownstreamBuilding>false</blockBuildWhenDownstreamBuilding>
  <blockBuildWhenUpstreamBuilding>false</blockBuildWhenUpstreamBuilding>
  <triggers>
    <hudson.triggers.TimerTrigger>
      <spec>H * * * *</spec>
    </hudson.triggers.TimerTrigger>
  </triggers>
  <concurrentBuild>false</concurrentBuild>
  <builders>

I want to read only the disabled value and the spec value and write them to a YAML file like this.

Expected output:

disabled: 'false'
name: Cancellation_CMT_Tickets
triggers:
  hudson.triggers.TimerTrigger:
    spec: H * * * *

I can dump the data to a YAML file in the above format only when the resulting dictionary has this shape:

d = {"triggers": {"hudson.triggers.TimerTrigger": {"spec": "H * * * *"}}}

My current code looks like this; the search keys are passed as command-line arguments:

import sys
import xml.etree.ElementTree as ET

import yaml

tree = ET.parse('test.xml')
root = tree.getroot()

d = {}


def xmpparse(root, searchkey):
    # Recursively store the text of every element whose tag matches searchkey.
    for child in root:
        if child.tag == searchkey:
            d[child.tag] = child.text
        elif len(child):
            xmpparse(child, searchkey)


# Skip sys.argv[0], which is the script name rather than a search key.
for i in sys.argv[1:]:
    xmpparse(root, i)

print(yaml.dump(d, default_flow_style=False))

Current output:

disabled: 'false'
spec: H * * * *

Any help would be much appreciated. Thanks in advance!

Upvotes: 3

Views: 2643

Answers (1)

Jack Fleeting

Reputation: 24930

I believe this should take care of the nested dictionary problem, at least. It is based on various Stack Overflow answers on how to build nested dictionaries (there may be other methods):

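    import lxml.html as LH

    class NestedDict(dict):
        # On first access to a missing key, create and store a nested NestedDict.
        def __missing__(self, key):
            self[key] = NestedDict()
            return self[key]

    data = """[your xml above]"""  # paste the XML from the question here

    doc = LH.fromstring(data)

    # Create the dictionary once, before the loop, so earlier matches are
    # not discarded and print(d) never sees an undefined name.
    d = NestedDict()
    for i in doc:
        if i.tag == 'triggers':
            for child in i:
                # Note: the lxml.html parser lowercases tag names.
                d[i.tag][child.tag][child[0].tag] = child[0].text.strip()

    print(d)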

Output:

{'triggers': {'hudson.triggers.timertrigger': {'spec': 'H * * * *'}}}
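
If the original tag case matters, or if the search keys should stay configurable as in the question, one possible alternative is to stay with xml.etree.ElementTree and record the ancestor tags while recursing. The following is only a minimal sketch under some assumptions: the function name collect, the XML constant, and the choice to leave the root tag (project) out of the path are illustrative and not part of the original code, and the sample XML is a closed-off subset of the snippet in the question.

```python
import xml.etree.ElementTree as ET

# Well-formed subset of the XML from the question (closing tags added).
XML = """<project>
  <scm class="hudson.scm.NullSCM"/>
  <assignedNode>python</assignedNode>
  <disabled>false</disabled>
  <triggers>
    <hudson.triggers.TimerTrigger>
      <spec>H * * * *</spec>
    </hudson.triggers.TimerTrigger>
  </triggers>
  <concurrentBuild>false</concurrentBuild>
</project>"""


def collect(element, search_keys, path=(), result=None):
    """Store the text of each element whose tag is in search_keys,
    nested under the tags of its ancestors (root tag excluded)."""
    if result is None:
        result = {}
    for child in element:
        child_path = path + (child.tag,)
        if child.tag in search_keys:
            # Walk (and create as needed) one dict level per ancestor tag.
            node = result
            for tag in child_path[:-1]:
                node = node.setdefault(tag, {})
            node[child.tag] = (child.text or "").strip()
        else:
            # Not a match: keep searching below this element.
            collect(child, search_keys, child_path, result)
    return result


root = ET.fromstring(XML)
nested = collect(root, {"disabled", "spec"})
print(nested)
```

The resulting plain dict keeps the case of hudson.triggers.TimerTrigger and could then be passed to yaml.dump(nested, default_flow_style=False) as in the question. The name entry in the expected output does not appear in the XML shown, so it would have to be added separately.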

Upvotes: 1
