Zamblek

Reputation: 809

xpath query with regex

It's very simple: there is an HTML file containing a div with a variable id, like this:

<div id="abc_1"></div>

The integer part of the id varies, so it could be abc_892, abc_553, etc.

What is the best query to select that div?

Upvotes: 4

Views: 4333

Answers (2)

Dimitre Novatchev

Reputation: 243479

The currently accepted answer selects such unwanted elements as:

<div id="abc_xyz"/>

However, only those div elements should be selected whose id not only starts with "abc_" but whose substring following the _ is a representation of an integer.

Use this XPath expression:

//div
   [@id[starts-with(., 'abc_') 
      and 
        floor(substring-after(.,'_')) 
       = 
        number(substring-after(.,'_')) 
       ]
   ]

This selects any div element that has an id attribute whose string value starts with the string "abc_" and whose substring after the _ is a valid representation of an integer.

Explanation:

Here we are using the fact that in XPath 1.0 this XPath expression:

floor($x) = number($x)

evaluates to true() exactly when $x is an integer.

This can be proven easily:

  1. If $x is an integer the above expression evaluates to true() by definition.

  2. If the above expression evaluates to true(), this means that neither side of the equality is NaN, because by definition NaN isn't equal to any value (including itself). But then this means that $x is a number (number($x) isn't NaN) and, by definition, a number $x that is equal to the integer floor($x) is an integer.
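The argument above can be sketched outside XPath. The following Python is only an emulation under stated assumptions, not an XPath engine: xpath_number and is_integer_suffix are hypothetical helper names, and the regular expression approximates the XPath 1.0 rule that number() returns NaN for any string that is not an optionally signed decimal surrounded by optional whitespace.

```python
import math
import re

# Approximation of the XPath 1.0 string-to-number rule (an assumption of this sketch):
# optional whitespace, optional minus sign, a plain decimal, optional whitespace.
_XPATH_NUMBER = re.compile(r"[ \t\r\n]*-?(\d+(\.\d*)?|\.\d+)[ \t\r\n]*")


def xpath_number(s: str) -> float:
    """Return the numeric value of s, or NaN when s is not a plain decimal."""
    return float(s) if _XPATH_NUMBER.fullmatch(s) else math.nan


def is_integer_suffix(id_value: str) -> bool:
    """Mimic the predicate: starts-with(., 'abc_') and floor(x) = number(x)."""
    if not id_value.startswith("abc_"):
        return False
    suffix = id_value.split("_", 1)[1]  # plays the role of substring-after(., '_')
    x = xpath_number(suffix)
    # NaN is never equal to anything, so a non-numeric suffix fails the test.
    # isfinite() also guards math.floor(), which raises on NaN and infinity.
    return math.isfinite(x) and math.floor(x) == x


for candidate in ["abc_892", "abc_xyz", "abc_1.5", "abc_", "abc_1_2"]:
    print(candidate, is_integer_suffix(candidate))
```

Under this emulation, the test looks at the numeric value rather than the spelling, so suffixes such as "-5" or "1.0" also pass. If only unsigned digit strings are wanted, a stricter check is needed.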

Alternative solution:

//div
   [@id[starts-with(., 'abc_') 
      and 
        'abc_' = translate(., '0123456789', '')
       ]
   ]
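The translate() variant can be illustrated the same way. This is a rough Python sketch rather than a real XPath evaluation; matches_translate_test is a hypothetical name, and str.translate with a deletion table stands in for translate(., '0123456789', ''), which removes every digit from the string.

```python
# Deletion table: drop every ASCII digit, like translate(., '0123456789', '').
DIGITS = str.maketrans("", "", "0123456789")


def matches_translate_test(id_value: str) -> bool:
    """Mimic: starts-with(., 'abc_') and 'abc_' = translate(., '0123456789', '')."""
    return id_value.startswith("abc_") and id_value.translate(DIGITS) == "abc_"


for candidate in ["abc_892", "abc_xyz", "abc_12x", "abc_-5", "abc_"]:
    print(candidate, matches_translate_test(candidate))
```

Unlike the floor()/number() test, this check rejects signs and decimal points, but it accepts a bare "abc_" with no digits at all, since removing zero digits still leaves "abc_". Whether that matters depends on the ids that actually occur in the document.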

Upvotes: 2

goat

Reputation: 31813

//div[starts-with(@id, "abc_")]

Upvotes: 6
