CraigP

Reputation: 453

Perl XPath: use "and" to return two nodes

<pets>
  <dog>woof</dog>
  <cow>moo</cow>
  <bird>chirp</bird>
</pets>

I'm using XML::LibXML, version 1.0, and I want to return the nodes only if both XPath expressions (conditions) are met.

If I use the vertical bar |, this XPath expression works:

//dog[.="woof"] | //bird[.="chirp"] <-- returns both nodes

...but since it's an "or", if only one matches, it will return the one matching node.

e.g. //dog[.="xwoof"] | //bird[.="chirp"] <-- returns just the matching bird node.

It's all or nothing - I want to only return if both match, like this:

//dog[.="woof"] and //bird[.="chirp"]

...but it doesn't accept an "and" condition, nor &.

What XPath syntax returns both nodes only when both conditions are met?

I want only the matching dog and bird elements, and no mention of pets.

Upvotes: 1

Views: 139

Answers (3)

ikegami

Reputation: 385917

//*[ dog="woof" and bird="chirp" ]/*[ self::dog="woof" or self::bird="chirp" ]

or

//dog[ .="woof" and ../bird="chirp" ] |
//bird[ .="chirp" and ../dog="woof" ]

Option 1 - Look down approach

Let's start by assuming you want to return the pets element. For that, we can use the following:

//*[ dog="woof" and bird="chirp" ]

[...] can be thought of as a WHERE clause. The above returns pets elements WHERE dog="woof" and bird="chirp" is true.

The following would also work (WHERE dog="woof" is true, and WHERE bird="chirp" is true):

//*[dog="woof"][bird="chirp"]

We can customize this to return other things by adding to it.

//*[ dog="woof" and bird="chirp" ]/( dog[.="woof"] | bird[.="chirp"] )

As you can see, | is the union operator which combines two sets of nodes.

One problem: a parenthesized expression used as a path step is XPath 2.0 syntax, and XML::LibXML does not support it because libxml2 only implements XPath 1.0. A small adjustment can be made to compensate.

//*[ dog="woof" and bird="chirp" ]/*[ self::dog="woof" or self::bird="chirp" ]
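
As a minimal sketch, this is how the XPath 1.0 expression above could be run with XML::LibXML against the XML from the question. The helper name matching_pair and the second, non-matching input are made up for illustration; parse_string, findnodes and toString are the only library calls used.

```perl
use strict;
use warnings;
use XML::LibXML;

# XPath 1.0 expression from the answer above.
my $XPATH = '//*[ dog="woof" and bird="chirp" ]/*[ self::dog="woof" or self::bird="chirp" ]';

# Hypothetical helper: returns the serialized matching nodes for an XML string.
sub matching_pair {
    my ($xml) = @_;
    my $doc = XML::LibXML->new->parse_string($xml);
    return map { $_->toString } $doc->findnodes($XPATH);
}

# The question's document, and a variant where the dog does not match.
my $both = '<pets><dog>woof</dog><cow>moo</cow><bird>chirp</bird></pets>';
my $one  = '<pets><dog>xwoof</dog><cow>moo</cow><bird>chirp</bird></pets>';

my @pair = matching_pair($both);
print "$_\n" for @pair;        # <dog>woof</dog> then <bird>chirp</bird>

my @none = matching_pair($one);
print scalar(@none), "\n";     # 0: all or nothing
```

With the second input nothing is returned, because the parent fails the first predicate before any child is considered.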


Option 2 - Look up/around approach

Another way to phrase the previous search is: the dogs and birds that have the other as a sibling.

//dog[ .="woof" and ../bird="chirp" ] |
//bird[ .="chirp" and ../dog="woof" ]


[Simplified dog[.="woof"] to dog="woof" where possible, as per @kjhughes's answer.]

Upvotes: 4

kjhughes

Reputation: 111621

This XPath,

//*[dog="woof" and bird="chirp"]/*[self::dog="woof" or self::bird="chirp"]

will select all dog and bird elements whose string values are "woof" and "chirp", respectively, and that exist within a common parent element:

<dog>woof</dog>
<bird>chirp</bird>

Credit: @ikegami's help in interpreting OP's intent was indispensable. Be sure to +1 his answer here.


Side note: | is a union operator (of node-sets in XPath 1.0 and node sequences in XPath 2.0). It is not logical OR.
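
To illustrate the difference, here is a small sketch using the question's non-matching variant (the dog says "xwoof"). In XPath 1.0, `and` between two paths is a valid expression, but it evaluates to a boolean rather than a node-set, so this sketch evaluates it with find() instead of findnodes(). The input string is an assumption made for the example.

```perl
use strict;
use warnings;
use XML::LibXML;

my $doc = XML::LibXML->new->parse_string(
    '<pets><dog>xwoof</dog><cow>moo</cow><bird>chirp</bird></pets>'
);

# Union: a node-set holding whatever either path selects.
my @union = $doc->findnodes('//dog[.="woof"] | //bird[.="chirp"]');
print scalar(@union), "\n";    # 1 (only the bird; the dog says "xwoof")

# Logical and: evaluates to a boolean, not a node-set, so use find().
my $both = $doc->find('//dog[.="woof"] and //bird[.="chirp"]');
print $both->value ? "both match\n" : "not both\n";    # not both
```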


Upvotes: 4

choroba

Reputation: 241918

It's not clear what should be returned if there are several dogs and a bird (or vice versa).

You can first check that there are at least two nodes satisfying the conditions, then select them:

//*[dog[.="woof"]][bird[.="chirp"]]/*[name()="dog" and .="woof" or name()="bird" and .="chirp"]

(tested in xsh which is great for playing with XPath expressions)
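
For the several-dogs case raised above, a quick sketch of what this expression selects with XML::LibXML. The input document here is invented for the example: two matching dogs, one non-matching dog, and one matching bird under the same parent.

```perl
use strict;
use warnings;
use XML::LibXML;

# Assumed input: two matching dogs, one non-matching dog, one matching bird.
my $doc = XML::LibXML->new->parse_string(
    '<pets><dog>woof</dog><dog>bark</dog><dog>woof</dog><bird>chirp</bird></pets>'
);

# The expression from this answer, split across two strings for readability.
my $xpath = '//*[dog[.="woof"]][bird[.="chirp"]]'
          . '/*[name()="dog" and .="woof" or name()="bird" and .="chirp"]';

my @names = map { $_->nodeName } $doc->findnodes($xpath);
print "@names\n";    # dog dog bird
```

Every matching dog and bird under the qualifying parent is returned, so the result can hold more than two nodes.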

Upvotes: 2
