user1494162

Reputation: 336

XPath: select elements whose parent isn't of the same type

I apologize for the possibly misleading title; I didn't know how to phrase it.

I have a huge XML file with lots of elements in it, and I need to select a certain element (named w:r), but only if it isn't inside another element named w:r.

e.g.:

<w:r>
    test
</w:r>

should select one element

<w:r>
    <w:r>
        test
    </w:r>
</w:r>

should also select only ONE element (the outer one) and not two.

My current solution is: //*[local-name()='r'], but it selects two elements for the second example (one being the outer element and the other being the inner element)

Upvotes: 0

Views: 716

Answers (1)

zx485

Reputation: 29052

You can select the w:r elements whose direct parent is not a w:r with the following XPath expression:

//*[local-name()='r' and not(parent::*[local-name()='r'])]

For the following XML (for testing purposes):

<?xml version='1.0' encoding='utf-8'?>
<root xmlns:w="xxx">
    <w:r t="c">
        test
    </w:r>  
    <w:r t="d">
        <w:r t="h">
            test
        </w:r>
    </w:r>
    <w:r t="e">
        <a>
            <b>
                <c>...
                    <w:r t="i">Something</w:r>
                    ...
                </c>
            </b>
        </a>
    </w:r>
</root>

The output is:

<w:r xmlns:w="xxx" t="c"/>
<w:r xmlns:w="xxx" t="d"/>
<w:r xmlns:w="xxx" t="e"/>
<w:r xmlns:w="xxx" t="i"/>

This means that every w:r element whose direct parent is not a w:r is selected. That includes the nested t="i" element: its parent is <c>, even though it sits inside the t="e" w:r.


If you want to take all ancestors into account and not just the direct parent, you can use the ancestor:: axis like this:

//*[local-name()='r' and not(ancestor::*[local-name()='r'])]

For the example XML, this selects only the t="c", t="d" and t="e" elements; t="i" is dropped because it has a w:r ancestor.
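
To check what the parent:: and ancestor:: variants each select, here is a minimal Python sketch using only the standard library. Python's xml.etree.ElementTree does not support local-name(), not() or the ancestor:: axis in its XPath subset, so the sketch reproduces the logic of the two expressions by walking the tree by hand rather than evaluating them. The helper names local_name and select_r are made up for this illustration, and the test XML is condensed (XML declaration and "..." text dropped).

```python
import xml.etree.ElementTree as ET

# Condensed copy of the test XML from the answer.
XML = """<root xmlns:w="xxx">
    <w:r t="c">test</w:r>
    <w:r t="d">
        <w:r t="h">test</w:r>
    </w:r>
    <w:r t="e">
        <a><b><c>
            <w:r t="i">Something</w:r>
        </c></b></a>
    </w:r>
</root>"""


def local_name(elem):
    # ElementTree stores qualified tags as "{namespace-uri}local".
    return elem.tag.rsplit("}", 1)[-1]


def select_r(root, check_all_ancestors):
    """Collect the t attribute of every 'r' element that passes the check.

    check_all_ancestors=False mimics not(parent::*[local-name()='r']);
    check_all_ancestors=True mimics not(ancestor::*[local-name()='r']).
    """
    selected = []

    def walk(elem, ancestors):
        if local_name(elem) == "r":
            # Either the whole ancestor chain or just the direct parent.
            scope = ancestors if check_all_ancestors else ancestors[-1:]
            if not any(local_name(a) == "r" for a in scope):
                selected.append(elem.get("t"))
        for child in elem:
            walk(child, ancestors + [elem])

    walk(root, [])
    return selected


root = ET.fromstring(XML)
parent_only = select_r(root, check_all_ancestors=False)
any_ancestor = select_r(root, check_all_ancestors=True)
print(parent_only)   # ['c', 'd', 'e', 'i']
print(any_ancestor)  # ['c', 'd', 'e']
```

The parent check keeps t="i" because its direct parent is <c>, while the ancestor check drops it because t="e" is further up the chain. With a library that implements full XPath 1.0, the two expressions from the answer could be evaluated directly instead.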

Upvotes: 1
