predi

Reputation: 5928

XPath condition for a whitespace separated text node

With an element like this:

<element>one two two-and-a-half three four</element>

is there a way to define an XPath 1.0 condition (one that evaluates to a boolean value) that checks whether the element's text node contains one or more of the whitespace-separated values, such as "two" and "three", assuming that the values may appear in any order? The values may also contain parts of other values, as shown by "two" and "two-and-a-half".

This question is about an XPath coding pattern and assumes no specific programming language/tool context. For the sake of argument, you may assume that element is already the context node for the expression and that

. = 'one two two-and-a-half three four'

would therefore evaluate to true.

Upvotes: 1

Views: 1107

Answers (1)

Abel

Reputation: 57169

In XPath 1.0, string manipulation within a single expression is unfortunately quite hard, so you're probably not going to like the solution below very much. If you were able to use XPath 2.0, this would become a simple .[tokenize(., ' ') = ('two', 'three', 'four')].

XPath 1.0

Without the help of a host language like XSLT, we are stuck with repetition. However, if we ignore for a moment that the first and last values have no leading or trailing space, this is a possible, yet somewhat naive solution:

.[contains(., 'two ') and contains(., ' two')]
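To see why this first attempt is naive, the two contains() tests can be mimicked outside XPath. The following is only an illustrative Python sketch: naive_match is a hypothetical helper, not part of any XPath library, and it assumes that plain substring checks behave like contains().

```python
def naive_match(text: str, token: str) -> bool:
    """Mimic contains(., 'token ') and contains(., ' token') from the naive expression."""
    return (token + " ") in text and (" " + token) in text


# The value sits in the middle, so both substring tests succeed.
print(naive_match("one two three", "two"))        # True

# False positive: 'xtwo ' satisfies the first test and ' two-and-a-half'
# satisfies the second, although 'two' itself is not one of the values.
print(naive_match("xtwo two-and-a-half", "two"))  # True

# False negative: the value is first, so it has no leading space.
print(naive_match("two three", "two"))            # False
```

The two tests can be satisfied by different parts of the string, and values at either end of the string are missed, which is what the next step addresses.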

Building onto that, we can add the leading/trailing space, creating a somewhat awkward, yet workable XPath 1.0 solution:

.[contains(concat(' ', ., ' '), ' two ')]

In this expression, concat(...) concatenates the current element's string value with a space before and after. This ensures that a test for a given value, 'two' in the example, is only true if that value occurs at least once with both a leading and a trailing space, that is, as a complete value rather than as part of another value such as 'two-and-a-half'.
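The padding trick can be checked with the same logic outside XPath. This is a minimal Python sketch under the assumption that values are separated by single spaces, as in the question; has_token is a hypothetical name used only for illustration.

```python
def has_token(text: str, token: str) -> bool:
    """Mimic contains(concat(' ', ., ' '), ' token ')."""
    padded = " " + text + " "          # concat(' ', ., ' ')
    return (" " + token + " ") in padded


text = "one two two-and-a-half three four"

print(has_token(text, "two"))   # True: ' two ' occurs in the padded string
print(has_token(text, "one"))   # True: the added leading space makes ' one ' match
print(has_token(text, "four"))  # True: the added trailing space makes ' four ' match
print(has_token(text, "and"))   # False: 'and' only occurs inside 'two-and-a-half'
```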

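Building onto that, we can expand this further to test for multiple conditions. The example uses and, which requires all values to be present; use or instead if any one of the values should be enough: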

.[contains(concat(' ', ., ' '), ' two ') and contains(concat(' ', ., ' '), ' three ')]
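Since the repetition grows with every value, a host language can assemble the condition for you. The sketch below is one possible way to do that in Python; token_condition and its operator parameter are hypothetical names, and the function deliberately rejects values containing whitespace or apostrophes rather than attempting to escape them.

```python
def token_condition(tokens, operator="and"):
    """Build the body of an XPath 1.0 predicate that tests for whole values.

    operator is 'and' (all values required) or 'or' (any value suffices).
    """
    if operator not in ("and", "or"):
        raise ValueError("operator must be 'and' or 'or'")
    if not tokens:
        raise ValueError("at least one token is required")

    tests = []
    for token in tokens:
        # Keep the sketch simple: no escaping of quotes, no embedded whitespace.
        if not token or "'" in token or any(ch.isspace() for ch in token):
            raise ValueError(f"unsupported token: {token!r}")
        tests.append(f"contains(concat(' ', ., ' '), ' {token} ')")

    return f" {operator} ".join(tests)


# Wrap the result in .[ ... ] (or any other select expression) as needed.
print(".[" + token_condition(["two", "three"], "or") + "]")
# .[contains(concat(' ', ., ' '), ' two ') or contains(concat(' ', ., ' '), ' three ')]
```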

Notes

Given the remark in your question that element is already the context node, I started all expressions with a leading dot. Just replace that dot with whatever expression selects element in your case.

Upvotes: 1
