newton

Reputation: 51

How to get an HTML element by text using XPath?

I've encountered a problem: I can't get an HTML element by the element's text. My HTML looks like this:

...
<table>
  ...
  <tr>
    ...
    <td class="oMain">test value</td>
    ...
  </tr>
  ...
</table>
...

For certain reasons, I have to get the '<td class="oMain">' element using its text 'test value'. I tried '//tr[td='test value']/td' but got no result. How can I write the XPath expression?

Any help is welcome. Thanks!

Upvotes: 1

Views: 4354

Answers (4)

user3518765

Reputation: 31

In the XPath expression, put the element node first (td in your case), then apply the filter text()='text node':

//td[text()='test value']
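
A minimal sketch of this in Python, assuming the standard-library xml.etree.ElementTree parser and a well-formed version of the snippet from the question. ElementTree supports only a subset of XPath and does not accept text(), so the sketch uses the closest predicate it does support, [.='test value'], which compares the element's complete text content. With a full XPath 1.0 engine the expression above can be used as written. The extra "other" cell is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Well-formed stand-in for the question's markup (the "other" cell is hypothetical).
HTML = """
<table>
  <tr>
    <td class="other">something else</td>
    <td class="oMain">test value</td>
  </tr>
</table>
"""

root = ET.fromstring(HTML)

# Put the element (td) first, then filter on its text.
# ElementTree spelling of the filter: [.='...'] instead of [text()='...'].
matches = root.findall(".//td[.='test value']")

for cell in matches:
    print(cell.get("class"), cell.text)
```

Only the cell whose text is exactly 'test value' should be selected; the comparison is exact, so surrounding whitespace in the cell would prevent a match.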

Hope this helps.

Upvotes: 2

Dennis Münkle

Reputation: 5071

Your XPath expression seems to be correct. Do you have a default namespace (e.g. XHTML) in your HTML? If so, you can modify your XPath like this:

//*[local-name()='td' and text()='test value']

If your XPath processor lets you bind a prefix (xhtml here) to that namespace, you could also use:

//xhtml:tr[xhtml:td='test value']/xhtml:td
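
To illustrate the namespace effect, here is a rough sketch in Python using the standard-library xml.etree.ElementTree. It assumes the page is well-formed XHTML with the usual default namespace; the surrounding html/body wrapper and the NS mapping name are my own additions. ElementTree has no local-name(), so the {*} wildcard (Python 3.8+) stands in for it here.

```python
import xml.etree.ElementTree as ET

# Assumed XHTML wrapper with a default namespace around the question's table.
XHTML = """
<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <table>
      <tr>
        <td class="oMain">test value</td>
      </tr>
    </table>
  </body>
</html>
"""

root = ET.fromstring(XHTML)

# Prefix-to-URI binding; the prefix name "xhtml" is arbitrary.
NS = {"xhtml": "http://www.w3.org/1999/xhtml"}

# Unprefixed names refer to elements in no namespace, so nothing matches.
unprefixed = root.findall(".//tr[td='test value']/td")

# The same path with every element name bound to the XHTML namespace.
prefixed = root.findall(".//xhtml:tr[xhtml:td='test value']/xhtml:td", NS)

# Namespace-agnostic match, roughly what local-name()='td' achieves.
any_ns = root.findall(".//{*}td[.='test value']")

print(len(unprefixed), len(prefixed), len(any_ns))  # prints: 0 1 1
```

If the unprefixed query comes back empty on a real document while the prefixed one matches, a default namespace is the likely cause.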

Does that help?

Upvotes: 1

FK82

Reputation: 5075

Your expression

//tr[td='test value']/td

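places the predicate on the parent node "tr", so it selects every td in a row that contains a matching cell rather than only the matching cell itself. Maybe that's what's causing the problem.

What you probably want is this: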

//td[@class = "oMain" and child::text() = 'test value']
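
The difference between the two predicates can be sketched in Python with the standard-library xml.etree.ElementTree, assuming a row that has a second, made-up "label" cell. ElementTree does not support "and" or text() inside predicates, so the sketch chains two predicates and uses [.='...'] instead; a full XPath engine can run the expression above directly.

```python
import xml.etree.ElementTree as ET

# The "label" cell is hypothetical, added to make the difference visible.
ROW = """
<table>
  <tr>
    <td class="label">name</td>
    <td class="oMain">test value</td>
  </tr>
</table>
"""

root = ET.fromstring(ROW)

# Predicate on tr: picks the row, then returns all of its td children.
all_cells = root.findall(".//tr[td='test value']/td")

# Predicates on td: class and text are both checked on the cell itself.
one_cell = root.findall(".//td[@class='oMain'][.='test value']")

print([c.get("class") for c in all_cells])  # prints: ['label', 'oMain']
print([c.get("class") for c in one_cell])   # prints: ['oMain']
```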

Here's a link to the W3C specification of the XPath language for further reading: http://www.w3.org/TR/xpath/

Upvotes: 1

jekozyra

Reputation: 5

What are you using to do the parsing? In Ruby + Hpricot, you can do

doc.search("//td.oMain").each do |cell|
  if cell.inner_html == "test value"
    return cell
  end
end

In this case, cell would be:

<td class="oMain">test value</td>

Upvotes: 0
