Reputation: 15
I have an XML file with the following contents:
<work>
<person>
<name>Jim</name>
<id>100</id>
<supervisor></supervisor>
</person>
<person>
<name>Jack</name>
<id>101</id>
<supervisor>100</supervisor>
</person>
<person>
<name>Joe</name>
<id>102</id>
<supervisor>101</supervisor>
</person>
<person>
<name>John</name>
<id>103</id>
<supervisor>102</supervisor>
</person>
</work>
and I would like to loop through all persons to find out who each one's top boss is. For example, Joe's direct supervisor is Jack, but I would like to find the top of the hierarchy, which is Jim.
So, something like
for person in persons
    top_boss = find_top_boss(supervisor)
    print name, top_boss
find_top_boss(supervisor) needs to walk up the hierarchy until Jim is found, so it probably needs to call itself recursively.
I need to return a list of (name, top boss) pairs.
I'm using Python and am open to any module that provides the tools; at the moment I'm trying lxml.
I'm just getting started and I'm only able to loop through the persons; I don't even know how to search for the supervisor. My knowledge of Python, lxml and XPath is very limited.
from lxml import etree

tree = etree.parse("work.xml")
for person in tree.xpath('//person'):
    # search supervisor for the person
    s = person.xpath("//id[text()=supervisor-element-value]")[0]
    print(s.text)
So, questions:
Suppose I can find the supervisor, say by using a static value in the XPath expression:
s = person.xpath("//id[text()='101']")[0]
That finds Jack's id element. How can I then get the value of Jack's supervisor element? Do I first need to find the parent of the id element, or is there another way?
Upvotes: 0
Views: 577
Reputation: 473753
I'd use the xmltodict package to load the XML into a Python data structure and then work with that.
Working example (it is not perfect in terms of the algorithm, but should give you a starting point):
from collections import OrderedDict
import xmltodict
data = """
<work>
<person>
<name>Jim</name>
<id>100</id>
<supervisor></supervisor>
</person>
<person>
<name>Jack</name>
<id>101</id>
<supervisor>100</supervisor>
</person>
<person>
<name>Joe</name>
<id>102</id>
<supervisor>101</supervisor>
</person>
<person>
<name>John</name>
<id>103</id>
<supervisor>102</supervisor>
</person>
</work>
"""
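d = xmltodict.parse(data)
# Index every person by id so a supervisor can be looked up directly.
persons = OrderedDict((person['id'], person) for person in d['work']['person'])


def get_supervisor(person):
    # An empty <supervisor> element parses to None: this person has no boss.
    if not person['supervisor']:
        return 'null'
    else:
        supervisor = persons[person['supervisor']]
        if not supervisor['supervisor']:
            # The supervisor has no boss of their own, so they are the top boss.
            return supervisor['name']
        else:
            # Otherwise keep climbing the hierarchy.
            return get_supervisor(supervisor)


for person in persons.values():
    print(person['name'], get_supervisor(person))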
Prints:
Jim null
Jack Jim
Joe Jim
John Jim
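If you would rather stay close to the element-tree API from the question, the same walk can be sketched without xmltodict. This is only an illustration under a few assumptions: it uses the standard library's xml.etree.ElementTree instead of lxml (the calls used here, fromstring, iter and findtext, exist in lxml.etree as well), the names top_bosses and find_top_boss are made up for this sketch, and it returns None rather than 'null' for a person without a supervisor. It loops instead of recursing and raises an error if the supervisor ids ever form a cycle.

```python
import xml.etree.ElementTree as ET  # stdlib stand-in; lxml.etree offers the same calls used here

XML = """<work>
  <person><name>Jim</name><id>100</id><supervisor></supervisor></person>
  <person><name>Jack</name><id>101</id><supervisor>100</supervisor></person>
  <person><name>Joe</name><id>102</id><supervisor>101</supervisor></person>
  <person><name>John</name><id>103</id><supervisor>102</supervisor></person>
</work>"""


def top_bosses(xml_text):
    """Return a list of (name, top boss name) tuples.

    Hypothetical helper for illustration: a person with no supervisor
    gets None as their top boss.
    """
    root = ET.fromstring(xml_text)
    # Index the <person> elements by id so each supervisor lookup is a dict access.
    by_id = {p.findtext("id"): p for p in root.iter("person")}

    def find_top_boss(person):
        seen = {person.findtext("id")}
        current = person
        while True:
            # findtext gives "" for an empty element and None for a missing one.
            boss_id = (current.findtext("supervisor") or "").strip()
            if not boss_id:
                break  # current has no supervisor, so the walk ends here
            if boss_id in seen:
                raise ValueError("cycle in supervisor chain at id " + boss_id)
            seen.add(boss_id)
            current = by_id[boss_id]  # raises KeyError for an unknown supervisor id
        return None if current is person else current.findtext("name")

    return [(p.findtext("name"), find_top_boss(p)) for p in by_id.values()]


for name, boss in top_bosses(XML):
    print(name, boss)
```

On the narrower question of getting from a matched id element to its supervisor: with lxml, calling getparent() on the id element returns the enclosing person element, and findtext("supervisor") on that gives the supervisor id. Indexing the person elements by id up front, as above, avoids running a separate XPath search at every step of the climb.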
Upvotes: 1