ewok

Reputation: 21513

XPath: get only elements with a certain subelement

I have a filesystem that is represented in an XML document in the following format:

<xml xmlns="namespace1" xmlns:ns2="namespace2">
  <entry>
    <id>123</id>
    <ns2:content name="type">directory</ns2:content>
    <ns2:content name="numErrors">3</ns2:content>
  </entry>
  ...
  <entry>
    <id>456</id>
    <ns2:content name="type">file</ns2:content>
    <ns2:content name="docState">success</ns2:content>
  </entry>
  ...
</xml>

Using Python's lxml, I need to retrieve only the entry elements that represent directories. Every entry contains a <ns2:content name="type"> element, and I want a list of the entry elements where that element's text is equal to directory. I can do this in several inconvenient steps, but I would rather have a single query for it. Here is how I would do it in steps:

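# xml_parse.py
# `tree` is assumed to be an lxml tree that has already been parsed.

ns = {'ns1': 'namespace1', 'ns2': 'namespace2'}
for node in tree.xpath("//ns1:entry", namespaces=ns):
    # find() needs the prefix map too, otherwise the ns2 prefix cannot be resolved
    if node.find("ns2:content[@name='type']", namespaces=ns).text == "directory":
        # do stuff with node
        pass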

Can anyone explain how to do this filtering in the XPath query of the for statement itself, instead of using an if?

Thanks

Upvotes: 4

Views: 2192

Answers (1)

Wayne

Reputation: 60424

Use the following XPath expression:

//ns1:entry[ns2:content[@name='type' and .='directory']]
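The outer predicate keeps only the entry elements that have a matching ns2:content child, and the inner predicate requires that child to have name="type" and the text directory. As a minimal sketch of how this might be used with lxml, assuming the sample document from the question (inlined here as a string, with the elided entries left out) and the same prefix map; the variable names directories and directory_ids are only illustrative:

```python
from lxml import etree

# Sample document modeled on the question; in practice the real file would be parsed.
XML = b"""<xml xmlns="namespace1" xmlns:ns2="namespace2">
  <entry>
    <id>123</id>
    <ns2:content name="type">directory</ns2:content>
    <ns2:content name="numErrors">3</ns2:content>
  </entry>
  <entry>
    <id>456</id>
    <ns2:content name="type">file</ns2:content>
    <ns2:content name="docState">success</ns2:content>
  </entry>
</xml>"""

# The prefixes are arbitrary; only the namespace URIs have to match the document.
ns = {"ns1": "namespace1", "ns2": "namespace2"}
tree = etree.fromstring(XML)

# One query: entries that have a type content child whose text is 'directory'.
directories = tree.xpath(
    "//ns1:entry[ns2:content[@name='type' and .='directory']]",
    namespaces=ns,
)

# Collect the ids of the matching entries to show which ones were selected.
directory_ids = [entry.findtext("ns1:id", namespaces=ns) for entry in directories]
print(directory_ids)
```

The loop from the question can then iterate over the query result directly, with no if test inside the body.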

Upvotes: 5
