Reputation: 297
Consider this XML structure, a dumbed-down version of the DDEX standard:
<doc>
<master>
<ResourceInfo>
<Name>Foo</Name>
<Seq>1</Seq>
</ResourceInfo>
<ResourceInfo>
<Name>Bar</Name>
<Seq>2</Seq>
</ResourceInfo>
</master>
<track>
<Resource>
<Name>Foo</Name>
</Resource>
</track>
<track>
<Resource>
<Name>Bar</Name>
</Resource>
</track>
</doc>
I'd like to select the ResourceInfo node in <master> with a child <Name> matching the text value of the Name of each track node, in order to get the Seq number.
I can do so directly by getting an lxml tree of each track and explicitly requesting the <ResourceInfo> elements like this:
track.xpath('/doc/master/ResourceInfo/Seq[../Name[text()="Foo"]]')
But that assumes I know the name of each track and can state it ahead of time. I'd like to map this generically and replace the "Foo" in the XPath with a reference to the Name text() of the current track's Resource.
It's kind of like joining tracks and resources on the text() of the Name in master with the text() of the Name in each track. Is there an easy way of doing this with XPath?
I'm trying to iterate over each track and pull the Seq for that track, so I can't explicitly ask for "Foo". I need to introspect: "Give me the Seq that is a sibling of a <Name> node in master with a value matching the <Name> of the current node in <track>".
Upvotes: 1
Views: 173
Reputation: 1605
After reading your comment I now understand what you are after. You can simply use Python to do the join:
from lxml import etree

doc = etree.parse('sample.xml')

# gather resources: map each Name in <master> to its Seq
resources = {}
for element in doc.xpath('/doc/master/ResourceInfo'):
    name = element.findtext('Name')
    seq = element.findtext('Seq')
    resources[name] = seq

# gather tracks: collect the Name of each track's Resource
tracks = []
for element in doc.xpath('/doc/track/Resource/Name'):
    tracks.append(element.text)

# join: look up each track name in the resources mapping
for track in tracks:
    print('Track: %s, seq: %s' % (track, resources.get(track)))

# prints:
# Track: Foo, seq: 1
# Track: Bar, seq: 2
Previous Answer:
The XML was not well formed:
<doc>
<master>
<ResourceInfo>
<Name>Foo</Name>
<Seq>1</Seq>
</ResourceInfo>
<ResourceInfo>
<Name>Bar</Name>
<Seq>2</Seq>
</ResourceInfo>
</master>
<track>
<Resource>
<Name>Foo</Name>
</Resource>
</track> <!-- was missing the slash -->
<track>
<Resource>
<Name>Bar</Name>
</Resource>
</track>
</doc>
Your code works:
from lxml import etree

doc = etree.parse('a.xml')

for element in doc.xpath('/doc/master/ResourceInfo/Seq[../Name[text()="Foo"]]'):
    # print(etree.tostring(element))
    print(element.text)

# prints:
# 1
Upvotes: 1
Reputation: 52888
I'm not sure if I understand completely, but if the current context is:
/doc/track/Resource/Name
and you use the following XPath:
/doc/master/ResourceInfo[Name = current()]/Seq
you should get the Seq of the ResourceInfo with the same Name.
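One caveat: current() is defined by XSLT rather than by XPath 1.0 itself, so as far as I know it is not available when an expression is evaluated directly through lxml's xpath() method. A minimal sketch of the same join using an XPath variable instead, assuming lxml is installed. The variable name $name, the helper seq_for_tracks, and the inline sample document are illustrative and not from the posts above:

```python
from lxml import etree

# Inline copy of the sample document so the sketch is self-contained.
XML = b"""<doc>
  <master>
    <ResourceInfo><Name>Foo</Name><Seq>1</Seq></ResourceInfo>
    <ResourceInfo><Name>Bar</Name><Seq>2</Seq></ResourceInfo>
  </master>
  <track><Resource><Name>Foo</Name></Resource></track>
  <track><Resource><Name>Bar</Name></Resource></track>
</doc>"""

# Compile the expression once; $name is bound on each call.
# ($name is a hypothetical variable name chosen for this example.)
FIND_SEQ = etree.XPath('/doc/master/ResourceInfo[Name = $name]/Seq/text()')


def seq_for_tracks(root):
    """Return (track name, Seq) pairs; Seq is None when no match exists."""
    pairs = []
    for track in root.xpath('/doc/track'):
        name = track.findtext('Resource/Name')
        matches = FIND_SEQ(root, name=name)
        pairs.append((name, matches[0] if matches else None))
    return pairs


root = etree.fromstring(XML)
for name, seq in seq_for_tracks(root):
    print('Track: %s, seq: %s' % (name, seq))
```

Passing the name as a variable also avoids quoting problems that string formatting would cause if a track name contained a quote character.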
Upvotes: 2