user897210

Reputation: 109

XPath: Complex expression to include some nodes and exclude others

Example xml:

<foo>
    <bar name="bar1">
    </bar>
    <bar name="bar2">
    </bar>
</foo>
<qux>
    <foo>
        <bar name="bar3">
        </bar>
    </foo>
     <bar name="bar4">
     </bar>
</qux>

What is the expression to select all bar elements that are children of the root foo (bar1, bar2, bar4) but not the nested foo (bar3)?

Thank you in advance!

Upvotes: 1

Views: 925

Answers (2)

Dimitre Novatchev

Reputation: 243579

What is the expression to select all bar elements that are children of the root foo (bar1, bar2, bar4) but not the nested foo (bar3)?

Here is probably one of the simplest and shortest XPath expressions for this. Evaluated on any well-formed XML document whose top element is foo, with foo elements nested to any depth, it selects exactly the bar elements whose only foo ancestor is the top element:

//bar[not(ancestor::foo[2])]

This XPath expression selects every bar element in the document that has fewer than two foo ancestors. Because the top element is by definition a foo, every bar has that top element as an ancestor. A bar inside a nested foo has at least a second foo ancestor, so in that case boolean(ancestor::foo[2]) is true() and the predicate rejects it.
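The counting argument above can be checked outside XSLT as well. The following is a minimal Python sketch, not part of the original answer: the standard library's ElementTree does not support the ancestor axis, so the sketch emulates not(ancestor::foo[2]) by walking a child-to-parent map. The sample document and the function name bars_with_single_foo_ancestor are hypothetical, chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document: a top-level foo containing one nested foo.
XML = """<foo>
    <bar name="bar1"/>
    <bar name="bar2"/>
    <qux>
        <foo>
            <baz><bar name="bar3"/></baz>
        </foo>
        <bar name="bar4"/>
    </qux>
</foo>"""


def bars_with_single_foo_ancestor(root):
    """Return the names of bar elements that have fewer than two foo ancestors."""
    # ElementTree elements have no parent pointer, so build a child -> parent map.
    parent = {child: node for node in root.iter() for child in node}
    selected = []
    for bar in root.iter("bar"):  # document order
        foo_ancestors = 0
        node = parent.get(bar)
        while node is not None:
            if node.tag == "foo":
                foo_ancestors += 1
            node = parent.get(node)
        # Same condition as the predicate not(ancestor::foo[2]).
        if foo_ancestors < 2:
            selected.append(bar.get("name"))
    return selected


print(bars_with_single_foo_ancestor(ET.fromstring(XML)))
# prints ['bar1', 'bar2', 'bar4']
```

With a library that implements the full XPath 1.0 axes, the original expression could be evaluated directly instead of being emulated.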

XSLT-based verification:

This transformation:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>

 <xsl:template match="/">
  <xsl:copy-of select=
   "//bar[not(ancestor::foo[2])]"/>
 </xsl:template>
</xsl:stylesheet>

when applied to the following XML document (based on the provided XML fragment, but made well-formed and given slightly more nesting and complexity to make this interesting):

<foo>
    <bar name="bar1">
    </bar>
    <bar name="bar2">
    </bar>
    <qux>
        <foo>
           <baz>
             <bar name="bar3">
             </bar>
            </baz>
        </foo>
        <bar name="bar4">
        </bar>
        <qux>
            <foo>
                <bar name="bar5">
                </bar>
            </foo>
            <bar name="bar6">
            </bar>
        </qux>
    </qux>
</foo>

outputs exactly the wanted elements:

<bar name="bar1">

</bar>
<bar name="bar2">

</bar>
<bar name="bar4">

</bar>
<bar name="bar6">

</bar>

Upvotes: 2

harpo

Reputation: 43208

As @Cheeso said, the document is invalid, and it doesn't seem to jibe with your question.

If this is the document you meant (where qux is inside the first foo)

<foo>
    <bar name="bar1">
    </bar>
    <bar name="bar2">
    </bar>
    <qux>
        <foo>
            <bar name="bar3">
            </bar>
        </foo>
        <bar name="bar4">
        </bar>
    </qux>
</foo>

then here are two paths

//bar[not(parent::foo[ancestor::foo])]
//bar[1 >= count(ancestor::foo)]

that will select the elements you want (tested in .NET).
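As a rough cross-check, here is a Python sketch (standard library only, so both predicates are emulated by hand rather than evaluated as XPath). The names HARPO_DOC, WRAPPED_DOC, foo_ancestors and select_bars are hypothetical. It suggests that the two paths agree on the document above but can differ when a bar sits inside another element within the nested foo, because the first path only inspects the direct parent.

```python
import xml.etree.ElementTree as ET

# The document from this answer, with empty bar elements abbreviated.
HARPO_DOC = """<foo>
    <bar name="bar1"/><bar name="bar2"/>
    <qux>
        <foo><bar name="bar3"/></foo>
        <bar name="bar4"/>
    </qux>
</foo>"""

# Hypothetical variant: bar3 is wrapped in a baz inside the nested foo.
WRAPPED_DOC = HARPO_DOC.replace(
    '<foo><bar name="bar3"/></foo>',
    '<foo><baz><bar name="bar3"/></baz></foo>')


def foo_ancestors(node, parent):
    """Count the foo elements among the ancestors of node."""
    count = 0
    node = parent.get(node)
    while node is not None:
        if node.tag == "foo":
            count += 1
        node = parent.get(node)
    return count


def select_bars(xml_text):
    """Return (names chosen by the parent test, names chosen by the count test)."""
    root = ET.fromstring(xml_text)
    parent = {child: node for node in root.iter() for child in node}
    by_parent, by_count = [], []
    for bar in root.iter("bar"):
        p = parent.get(bar)
        # Emulates //bar[not(parent::foo[ancestor::foo])]
        nested_foo_parent = (p is not None and p.tag == "foo"
                             and foo_ancestors(p, parent) > 0)
        if not nested_foo_parent:
            by_parent.append(bar.get("name"))
        # Emulates //bar[1 >= count(ancestor::foo)]
        if foo_ancestors(bar, parent) <= 1:
            by_count.append(bar.get("name"))
    return by_parent, by_count


print(select_bars(HARPO_DOC))
# prints (['bar1', 'bar2', 'bar4'], ['bar1', 'bar2', 'bar4'])
print(select_bars(WRAPPED_DOC))
# prints (['bar1', 'bar2', 'bar3', 'bar4'], ['bar1', 'bar2', 'bar4'])
```

If bar elements may appear deeper than direct children of a nested foo, the count-based path looks like the safer of the two.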

Upvotes: 1
