jepulis

Reputation: 15

python, how to find parent element

I have an XML file with the following contents:

<work>
  <person>
    <name>Jim</name>
    <id>100</id>
    <supervisor></supervisor>
  </person>
  <person>
    <name>Jack</name>
    <id>101</id>
    <supervisor>100</supervisor>
  </person>
  <person>
    <name>Joe</name>
    <id>102</id>
    <supervisor>101</supervisor>
  </person>
  <person>
    <name>John</name>
    <id>103</id>
    <supervisor>102</supervisor>
  </person>
</work>

and I would like to loop through all persons to find out who their top boss is. For example, Joe's direct supervisor is Jack, but I would like to find the top of the hierarchy, which is Jim.

So, something like

for person in persons:
  top_boss = find_top_boss(supervisor)
  print(name, top_boss)

find_top_boss(supervisor) needs to walk up the hierarchy until Jim is found, so it probably needs to call itself recursively.

I need to return a list of (name, top boss) pairs.

I'm using Python and am open to any module that provides the tools; right now I'm trying lxml.

I'm just getting started and I'm only able to loop through the persons; I don't even know how to search for the supervisor. My knowledge of Python, lxml and XPath is very limited.

from lxml import etree
tree = etree.parse("work.xml")
for person in tree.xpath('//person'):
  # search supervisor for the person
  s = person.xpath("//id[text()=supervisor-element-value]")[0]
  print(s.text)

So, questions:

  1. How can I use the value of the current person's supervisor element in an XPath search?
  2. Suppose I can find the supervisor, for example by using a static value in the XPath:

    s = person.xpath("//id[text()='101']")[0]

This finds Jack's id element. How can I then get the value of Jack's supervisor element? Do I first need to find the parent element of that id, or is there another way?

Upvotes: 0

Views: 577

Answers (1)

alecxe

Reputation: 473753

I'd use the xmltodict package to load the XML into a Python data structure and then work with that.

Working example (it is not perfect in terms of the algorithm, but should give you a starting point):

from collections import OrderedDict
import xmltodict

data = """
<work>
  <person>
    <name>Jim</name>
    <id>100</id>
    <supervisor></supervisor>
  </person>
  <person>
    <name>Jack</name>
    <id>101</id>
    <supervisor>100</supervisor>
  </person>
  <person>
    <name>Joe</name>
    <id>102</id>
    <supervisor>101</supervisor>
  </person>
  <person>
    <name>John</name>
    <id>103</id>
    <supervisor>102</supervisor>
  </person>
</work>
"""

d = xmltodict.parse(data)

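# Index persons by id so a supervisor can be looked up directly.
persons = OrderedDict((person['id'], person) for person in d['work']['person'])

def get_supervisor(person):
    """Return the name of the top boss above `person`, or 'null' if there is none."""
    if not person['supervisor']:
        # An empty <supervisor> element yields a falsy value: nobody is above this person.
        return 'null'
    supervisor = persons[person['supervisor']]
    if not supervisor['supervisor']:
        # The supervisor has no supervisor, so this is the top of the hierarchy.
        return supervisor['name']
    # Otherwise keep climbing.
    return get_supervisor(supervisor)

for person in persons.values():
    print(person['name'], get_supervisor(person))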

Prints:

Jim null
Jack Jim
Joe Jim
John Jim
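
Since the question asks about working on the element tree directly, here is a rough sketch of the same idea without xmltodict. It uses the standard library's xml.etree.ElementTree; lxml.etree exposes the same iter() and findtext() calls, so it should carry over. The function name top_bosses, the by_id index and the cycle guard are my own additions for illustration, not something from the question. Selecting whole person elements (rather than the id element) avoids having to step back up to a parent at all.

```python
import xml.etree.ElementTree as ET  # lxml.etree offers the same iter()/findtext() calls

XML = """
<work>
  <person><name>Jim</name><id>100</id><supervisor></supervisor></person>
  <person><name>Jack</name><id>101</id><supervisor>100</supervisor></person>
  <person><name>Joe</name><id>102</id><supervisor>101</supervisor></person>
  <person><name>John</name><id>103</id><supervisor>102</supervisor></person>
</work>
"""

def top_bosses(root):
    """Return a list of (name, top_boss_name) pairs; top_boss_name is None for the top boss."""
    # Hypothetical helper index: <id> text -> <person> element.
    by_id = {p.findtext("id"): p for p in root.iter("person")}
    result = []
    for person in by_id.values():
        current, seen = person, set()
        while True:
            sup_id = (current.findtext("supervisor") or "").strip()
            # Stop at an empty or unknown supervisor, or if the data loops back on itself.
            if not sup_id or sup_id not in by_id or sup_id in seen:
                break
            seen.add(sup_id)
            current = by_id[sup_id]
        top = None if current is person else current.findtext("name")
        result.append((person.findtext("name"), top))
    return result

pairs = top_bosses(ET.fromstring(XML))
for name, boss in pairs:
    print(name, boss)
```

If you do want to stay with XPath in lxml, it also lets you pass values into an expression as keyword arguments (referenced as $name inside the expression), and elements have a getparent() method for stepping from an id element back up to its person.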

Upvotes: 1
