MasterJoe

Reputation: 2325

How to use XPath contains() for specific text?

Say we have an HTML table which basically looks like this:

2|1|28|9|
3|8|5|10|
18|9|8|0|

I want to select the cells which contain only 8 and nothing else, that is, only the 2nd cell of row 2 and the 3rd cell of row 3.

This is what I tried: //table//td[contains(.,'8')]. It gives me all cells which contain 8. So, I get unwanted values 28 and 18 as well.

How do I fix this?

EDIT: Here is a sample table if you want to try your XPath. Use the calendar on the left side: https://sfbay.craigslist.org/sfc/

Upvotes: 8

Views: 38635

Answers (2)

kjhughes

Reputation: 111491

Be careful of the contains() function.

It is a common mistake to use it to test if an element contains a value. What it really does is test if a string contains a substring. So, td[contains(.,'8')] takes the string value of td (.) and tests if it contains any '8' substrings. This might be what you want, but often it is not.
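
To make the substring behaviour concrete, here is a minimal Python sketch. It relies on the standard library's xml.etree.ElementTree, which does not implement contains(), so the test is emulated with Python's `in` operator on each cell's string value. The table markup is an assumption made up to mirror the question's example.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup mirroring the table in the question.
TABLE = """<table>
  <tr><td>2</td><td>1</td><td>28</td><td>9</td></tr>
  <tr><td>3</td><td>8</td><td>5</td><td>10</td></tr>
  <tr><td>18</td><td>9</td><td>8</td><td>0</td></tr>
</table>"""


def string_value(elem):
    """Concatenate all descendant text, like the XPath string-value of an element."""
    return "".join(elem.itertext())


root = ET.fromstring(TABLE)

# Emulates td[contains(., '8')]: a plain substring test on the string value.
substring_hits = [
    string_value(td) for td in root.iter("td") if "8" in string_value(td)
]
print(substring_hits)  # ['28', '8', '18', '8']
```

The cells holding 28 and 18 are picked up because their string values contain the character 8, which is exactly the unwanted result described in the question.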

This XPath,

//td[.='8']

will select all td elements whose string-value equals 8.

Alternatively, this XPath,

//td[normalize-space()='8']

will select all td elements whose normalize-space() string-value equals 8. (The normalize-space() XPath function strips leading and trailing whitespace and replaces sequences of whitespace characters with a single space.)

Notes:

  • Both will work even if the 8 is inside another element such as an a, b, span, div, etc.
  • Neither will match <td>gr8t</td>, <td>123456789</td>, etc.
  • Using normalize-space() will ignore leading or trailing whitespace surrounding the 8.
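
The notes above can be checked with a small Python sketch. Python's xml.etree.ElementTree (3.7 and later) accepts the [.='text'] predicate directly, but it does not support normalize-space(), so that part is only approximated with a helper. The sample cells and the helper names are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical cells covering the cases listed in the notes.
ROW = ET.fromstring(
    "<tr>"
    "<td>8</td>"
    "<td><a>8</a></td>"
    "<td> 8 </td>"
    "<td>28</td>"
    "<td>gr8t</td>"
    "</tr>"
)


def string_value(elem):
    """Concatenate all descendant text, like the XPath string-value of an element."""
    return "".join(elem.itertext())


def normalize_space(text):
    # Rough stand-in for XPath normalize-space(): trims the ends and collapses
    # whitespace runs. str.split() treats a few more characters as whitespace
    # than XPath does, so this is an approximation.
    return " ".join(text.split())


# Native ElementTree support for td[.='8'].
exact = ROW.findall("td[.='8']")

# Emulates td[normalize-space()='8'].
normalized = [
    td for td in ROW.iter("td") if normalize_space(string_value(td)) == "8"
]

print(len(exact))       # 2: the plain cell and the one wrapping <a>8</a>
print(len(normalized))  # 3: also the cell with surrounding spaces
```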


Upvotes: 14

jedifans

Reputation: 2297

Try the following XPath, which compares the cell's whole text rather than matching substrings:

//table//td[text()='8']

Edit: Your example HTML has a (anchor) elements inside the td elements, so the following will work:

//table//td/a[text()="8"]

See an example in PHP here: https://3v4l.org/56SBn
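
Why the first expression stops matching once the 8 sits inside an a element can be sketched in Python. text() selects only the text nodes that are direct children of the element, whereas . uses the text of all descendants. The standard library's xml.etree.ElementTree does not support text(), so it is emulated here with a hypothetical helper, own_text_nodes.

```python
import xml.etree.ElementTree as ET


def own_text_nodes(elem):
    """Text nodes that are direct children of elem, roughly what text() selects."""
    nodes = [elem.text] + [child.tail for child in elem]
    return [t for t in nodes if t]


plain = ET.fromstring("<td>8</td>")
linked = ET.fromstring("<td><a>8</a></td>")

# Emulates td[text()='8'].
plain_matches = "8" in own_text_nodes(plain)    # True
linked_matches = "8" in own_text_nodes(linked)  # False: the 8 belongs to <a>

# Emulates td/a[text()='8'].
anchor_matches = "8" in own_text_nodes(linked.find("a"))  # True

print(plain_matches, linked_matches, anchor_matches)
```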

Upvotes: 6
