Gritty Kitty

Reputation: 297

How do I refer to the value of the child of a current node when selecting another node with XPath? (DDEX Related)

Consider this XML structure, a dumbed-down version of the DDEX standard:

<doc>
<master>
 <ResourceInfo>
  <Name>Foo</Name>
  <Seq>1</Seq>
 </ResourceInfo>
 <ResourceInfo>
  <Name>Bar</Name>
  <Seq>2</Seq>
 </ResourceInfo>
</master>
<track>
 <Resource>
  <Name>Foo</Name>
 </Resource>
</track>
<track>
 <Resource>
  <Name>Bar</Name>
 </Resource>
</track>
</doc>

For each track node, I'd like to select the ResourceInfo node in <master> whose child <Name> matches the text value of the track's Name, so that I can get its Seq number.

I can do so directly by getting an lxml tree of each track and explicitly requesting the <Seq> of the matching <ResourceInfo> like this:

track.xpath('/doc/master/ResourceInfo/Seq[../Name[text()="Foo"]]')

But that assumes I know the name of each track and can explicitly state it ahead of time. I'd like to be able to dumbly map this and somehow replace the "Foo" in the XPath with some reference to the Name text() of the current track's Resource.

It's kind of like joining tracks and resources on the text() of the Name in master with the text() of Name in each track. Is there an easy way of doing this with XPath?

I'm trying to iterate over each track and pull the matching Seq from master. Therefore, I can't explicitly ask for "Foo". I need to introspect: "Give me the Seq that is a sibling of a <Name> node in master whose value matches the <Name> of the current node in <track>".

Upvotes: 1

Views: 173

Answers (2)

dlink

Reputation: 1605

After reading your comment I now understand what you are after. You can simply use Python to do the join:

from lxml import etree

doc = etree.parse('sample.xml')

# gather resources
resources = {}
for element in doc.xpath('/doc/master/ResourceInfo'):
    # look children up by tag name rather than by position
    name = element.findtext('Name')
    seq  = element.findtext('Seq')
    resources[name] = seq

# gather tracks
tracks = []
for element in doc.xpath('/doc/track/Resource/Name'):
    name = element.text
    tracks.append(name)

# join:

for track in tracks:
    print('Track: %s, seq: %s' % (track, resources.get(track)))

# prints:
# Track: Foo, seq: 1
# Track: Bar, seq: 2

Previous Answer:

The XML was not well formed:

<doc>
  <master>
    <ResourceInfo>
      <Name>Foo</Name>
      <Seq>1</Seq>
    </ResourceInfo>
    <ResourceInfo>
      <Name>Bar</Name>
      <Seq>2</Seq>
    </ResourceInfo>
  </master>
  <track>
    <Resource>
      <Name>Foo</Name>
    </Resource>
  </track>  <!-- was missing the slash -->
  <track>
    <Resource>
      <Name>Bar</Name>
    </Resource>
  </track>
</doc>

Your code works:

from lxml import etree

doc = etree.parse('a.xml')

for element in doc.xpath('/doc/master/ResourceInfo/Seq[../Name[text()="Foo"]]'):
    #print(etree.tostring(element))
    print(element.text)

# prints
# 1
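
To avoid hard-coding "Foo", one option is lxml's support for XPath variables: keyword arguments passed to xpath() can be referenced in the expression as $name. The following is only a sketch under assumptions: the XML is inlined so that it runs on its own, and seq_for_tracks is a hypothetical helper name, not something from the question.

```python
from lxml import etree

# Inline copy of the sample document so the sketch is self-contained.
XML = b"""<doc>
  <master>
    <ResourceInfo><Name>Foo</Name><Seq>1</Seq></ResourceInfo>
    <ResourceInfo><Name>Bar</Name><Seq>2</Seq></ResourceInfo>
  </master>
  <track><Resource><Name>Foo</Name></Resource></track>
  <track><Resource><Name>Bar</Name></Resource></track>
</doc>"""

doc = etree.fromstring(XML)


def seq_for_tracks(root):
    """Hypothetical helper: pair each track's Name with the matching Seq."""
    pairs = []
    for track in root.xpath('/doc/track'):
        name = track.findtext('Resource/Name')
        # $name is bound to the keyword argument, so no string
        # formatting or quoting of the track name is needed.
        seqs = track.xpath(
            '/doc/master/ResourceInfo[Name = $name]/Seq/text()',
            name=name,
        )
        pairs.append((name, seqs[0] if seqs else None))
    return pairs


pairs = seq_for_tracks(doc)
for name, seq in pairs:
    print('Track: %s, seq: %s' % (name, seq))
```

A track whose Name has no counterpart in master gets None as its Seq here; whether that or an exception is preferable depends on the data.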

Upvotes: 1

Daniel Haley

Reputation: 52888

I'm not sure if I understand completely, but if the current context is:

/doc/track/Resource/Name

and you use the following XPath:

/doc/master/ResourceInfo[Name = current()]/Seq

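you should get the Seq of the ResourceInfo with the same Name. Note that current() is a function defined by XSLT, so this applies when the expression is evaluated inside an XSLT stylesheet.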
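
Since current() comes from XSLT rather than plain XPath 1.0, it is, as far as I know, not available in a bare lxml xpath() call. A minimal sketch of using it from Python anyway is to run a small stylesheet through lxml's etree.XSLT. The stylesheet below and its name=seq text output format are assumptions made for illustration only.

```python
from lxml import etree

# Inline copy of the sample document so the sketch is self-contained.
XML = b"""<doc>
  <master>
    <ResourceInfo><Name>Foo</Name><Seq>1</Seq></ResourceInfo>
    <ResourceInfo><Name>Bar</Name><Seq>2</Seq></ResourceInfo>
  </master>
  <track><Resource><Name>Foo</Name></Resource></track>
  <track><Resource><Name>Bar</Name></Resource></track>
</doc>"""

# Illustrative stylesheet: for each track Name, emit "Name=Seq" on its own line.
# Inside the predicate, "." is the ResourceInfo being tested, while current()
# is still the Name node selected by the enclosing xsl:for-each.
STYLESHEET = b"""<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:template match="/">
    <xsl:for-each select="/doc/track/Resource/Name">
      <xsl:value-of select="."/>
      <xsl:text>=</xsl:text>
      <xsl:value-of select="/doc/master/ResourceInfo[Name = current()]/Seq"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>"""

transform = etree.XSLT(etree.fromstring(STYLESHEET))
result = transform(etree.fromstring(XML))

# With method="text", str() on the result gives the plain text output.
output = str(result)
print(output)
```

If a stylesheet feels heavy for a single lookup, binding the track name from Python and comparing against it in the predicate achieves the same join without XSLT.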

Upvotes: 2
