bukzor

Reputation: 38462

XPath: Search for several nodes in specific order

I have an XML file with "hello" nodes containing "word" nodes:

<doc>
    <hello>
        <word>Hello</word><word>World</word><word>!</word>
    </hello>
    <hello>
        <word>Hello</word><word>!</word><word>World</word>
    </hello>
    <hello>
        <word>Hello</word><word>World</word><word>!</word><word>blorf</word>
    </hello>
    <hello>
        <word>Hello</word><word>Wo</word><word>rld!</word>
    </hello>
</doc>

I want to match only the first hello. The second one has the wrong order, and the third one has an extra word. The fourth has the right text, but it is divided into words incorrectly.


This query works in XPath 1.0 but is extremely wordy. Is there a simpler way?

//hello[count(word) = 3 and word[1] = "Hello" and word[2] = "World" and word[3] = "!"]

This works in XPath 2.0. Is there any way to do the equivalent in XPath 1.0?

//hello[deep-equal(data(subsequence(word,1)),('Hello','World','!'))]

Upvotes: 1

Views: 225

Answers (3)

Daniel Haley

Reputation: 52858

If you're using XPath 2.0, you can use string-join() to add a delimiter to separate the individual words.

//hello[string-join(word,'|')='Hello|World|!']

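If whitespace is supposed to be ignored, you may need to normalize each word before joining, e.g. string-join(word/normalize-space(.),'|'). Passing the whole word sequence to normalize-space() directly is a type error in XPath 2.0.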

Another XPath 2.0 alternative is to use deep-equal() to compare two sequences. This would be safer because it's not using a delimiter that might be part of the text value.

//hello[deep-equal(data(subsequence(word,1)),('Hello','World','!'))]
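If you want to sanity-check the expected result without an XPath 2.0 engine, the ordered comparison that deep-equal() does here can be imitated in plain Python. This is only a sketch using the standard library's ElementTree, not XPath itself; the names DOC and matching_hellos are made up for illustration.

```python
import xml.etree.ElementTree as ET

# The sample document from the question.
DOC = """<doc>
    <hello>
        <word>Hello</word><word>World</word><word>!</word>
    </hello>
    <hello>
        <word>Hello</word><word>!</word><word>World</word>
    </hello>
    <hello>
        <word>Hello</word><word>World</word><word>!</word><word>blorf</word>
    </hello>
    <hello>
        <word>Hello</word><word>Wo</word><word>rld!</word>
    </hello>
</doc>"""


def matching_hellos(xml_text, expected):
    """Return 1-based positions of <hello> elements whose <word> children
    have exactly the expected text values, in the same order."""
    root = ET.fromstring(xml_text)
    positions = []
    for pos, hello in enumerate(root.findall("hello"), start=1):
        # Collect the text of each direct <word> child, in document order.
        words = [w.text or "" for w in hello.findall("word")]
        # List equality checks both length and order, like comparing sequences.
        if words == list(expected):
            positions.append(pos)
    return positions


print(matching_hellos(DOC, ["Hello", "World", "!"]))  # [1]
```

Only the first hello comes back, which is the result the XPath expressions above are meant to produce.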

Upvotes: 1

Arup Rakshit

Reputation: 118261

You can use the XPath 1.0 expression below:

//hello[
  word[1][
    .='Hello' and following-sibling::word[1][
      .='World' and following-sibling::word[1][
        .='!' and count(following-sibling::word)=0
      ]
    ]
  ]
]

Output:

<hello>
    <word>Hello</word><word>World</word><word>!</word>
</hello>

Upvotes: 1

BeniBela

Reputation: 16907

Just treat the entire hello node as text. Be aware that this ignores word boundaries, so it also matches the fourth hello (Hello, Wo, rld!):

//hello[normalize-space(.) = "HelloWorld!"]
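To see which hello elements a whole-text comparison like this selects, here is a minimal Python sketch that approximates normalize-space(.) with the standard library. It is an emulation, not an XPath engine; DOC, normalize_space and text_matches are illustrative names, and str.split() treats a few more characters as whitespace than XPath does.

```python
import xml.etree.ElementTree as ET

# The sample document from the question.
DOC = """<doc>
    <hello>
        <word>Hello</word><word>World</word><word>!</word>
    </hello>
    <hello>
        <word>Hello</word><word>!</word><word>World</word>
    </hello>
    <hello>
        <word>Hello</word><word>World</word><word>!</word><word>blorf</word>
    </hello>
    <hello>
        <word>Hello</word><word>Wo</word><word>rld!</word>
    </hello>
</doc>"""


def normalize_space(text):
    """Approximate XPath normalize-space(): trim the ends and collapse
    runs of whitespace into a single space."""
    return " ".join(text.split())


def text_matches(xml_text, target):
    """Return 1-based positions of <hello> elements whose normalized
    string value (all descendant text concatenated) equals target."""
    root = ET.fromstring(xml_text)
    return [
        pos
        for pos, hello in enumerate(root.findall("hello"), start=1)
        if normalize_space("".join(hello.itertext())) == target
    ]


print(text_matches(DOC, "HelloWorld!"))  # [1, 4]
```

The first and the fourth hello both have the string value "HelloWorld!", so a whole-text test cannot tell them apart. If the split into words matters, a per-word comparison is needed.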

Upvotes: 1
