Erba Aitbayev

Reputation: 4333

Extract data from table cells and ignore specific child tags with XPath?

Having this html table:

<table class="info">
<tbody>
    <tr><td class="name">Year</td><td>2011</td></tr>
    <tr><td class="name">Area</td><td>45 m<sup>2</sup></td></tr>     
    <tr><td class="name">Condition</td><td>Renovated</td></tr>
</tbody>
</table>

I am trying to extract the data from the second cell in each row (that is: 2011, 45 m, Renovated).

I use this XPath expression:

//table[@class="info"]//td[2]//text()

Received output (wrong):

2011
45 m
2
Renovated

Desired output:

2011
45 m
Renovated

As you can see, for the second row I also received the value enclosed in the <sup> tag, and I want to exclude it. I know that I can use this expression instead (one slash removed before text()):

//table[@class="info"]//td[2]/text()

That solves the problem for this table, but it drops the text of every child element. I need to exclude only the <sup> tag inside <td>, because sometimes a <td> contains other tags whose text I want to keep.

So, I want to get the data from the second cell in each row while excluding only the text inside <sup> tags.

Upvotes: 1

Views: 931

Answers (1)

alecxe

Reputation: 473863

For every tr, take the second td and select /text() (single slash), which returns only the cell's own text nodes and not the text of its child elements. This worked for me:

//table[@class="info"]//tr/td[2]/text()

Prints:

2011
45 m
Renovated
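
To see the difference between the two slashes, here is a minimal sketch in Python. It assumes the third-party lxml package is available (the question does not say which XPath engine is used), and the names TABLE, direct and all_descendants are only illustrative:

```python
from lxml import html  # assumption: lxml is installed; any XPath 1.0 engine should behave the same

# The table from the question, copied as-is.
TABLE = """
<table class="info">
<tbody>
    <tr><td class="name">Year</td><td>2011</td></tr>
    <tr><td class="name">Area</td><td>45 m<sup>2</sup></td></tr>
    <tr><td class="name">Condition</td><td>Renovated</td></tr>
</tbody>
</table>
""".strip()

tree = html.fromstring(TABLE)

# /text() selects only text nodes that are direct children of the td.
direct = [str(t) for t in tree.xpath('//table[@class="info"]//tr/td[2]/text()')]

# //text() selects text nodes at any depth below the td, including inside <sup>.
all_descendants = [str(t) for t in tree.xpath('//table[@class="info"]//tr/td[2]//text()')]

print(direct)           # ['2011', '45 m', 'Renovated']
print(all_descendants)  # ['2011', '45 m', '2', 'Renovated']
```

Note that /text() skips the text of every child element, not just sup, so it only fits when none of the nested tags carry text you need.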

Or, if you want to exclude only the text of sup elements while keeping the text of other child elements:

//table[@class="info"]//tr/td[2]//text()[not(parent::sup)]
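
A rough sketch of how this behaves when a cell also contains a tag you want to keep. It assumes Python with the third-party lxml package, and the `<b>` in the last row is a hypothetical addition that is not part of the original table:

```python
from lxml import html  # assumption: lxml is installed

# The question's table, with a hypothetical <b> tag added to the last row
# to stand in for "tags inside <td> that I do not want to exclude".
TABLE = """
<table class="info">
<tbody>
    <tr><td class="name">Year</td><td>2011</td></tr>
    <tr><td class="name">Area</td><td>45 m<sup>2</sup></td></tr>
    <tr><td class="name">Condition</td><td><b>Fully</b> renovated</td></tr>
</tbody>
</table>
""".strip()

tree = html.fromstring(TABLE)

# Take every text node under the second td, then drop those whose parent is <sup>.
expr = '//table[@class="info"]//tr/td[2]//text()[not(parent::sup)]'
values = [str(t) for t in tree.xpath(expr)]

print(values)  # ['2011', '45 m', 'Fully', ' renovated']

# Optionally join the pieces of each cell back into one string per row.
cells = [
    "".join(td.xpath('.//text()[not(parent::sup)]')).strip()
    for td in tree.xpath('//table[@class="info"]//tr/td[2]')
]
print(cells)  # ['2011', '45 m', 'Fully renovated']
```

The parent::sup test only filters text sitting directly inside sup. If the sup itself may contain nested tags, not(ancestor::sup) would exclude their text as well.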

Upvotes: 1
