Oleksandr Martynov

Reputation: 330

XPath: match current node or one of its child nodes

I'm having trouble creating an XPath for the following HTML:

<html>
<body>
<table class="tablesorter">
<tbody>     
    <tr class="tr_class">
                    <td>{some td info}</td>
                    <td>{some td info}</td>                    
                    <td>
                        <span class="span1">
                            <span class="span2">Out</span>
                            <span class="span3">SMTH</span>
                            <span class="span4">Out</span>
                        </span>
                    </td>   
    </tr>

    <tr class="tr_class">
                    <td>{some td info}</td>
                    <td>{some td info}</td>                    
                    <td>In</td> 
    </tr>

    <tr class="tr_class">
                    <td>{some td info}</td>
                    <td>{some td info}</td>                    
                    <td>In</td> 
    </tr>   

</tbody>
</table>
</body>
</html>

What I want is an XPath that returns the content of each third td node (if it has no children) or the content of its span that has class="span2". For example, for this HTML it should return

Out,In,In

I have an XPath that returns the needed span node. It looks like:

//table[@class = 'tablesorter']//td[3]/descendant::*[@class='span2']/text()

and I have an XPath that returns the plain content of each third td node:

//table[@class = 'tablesorter']//td[3][count(descendant::*)=0]/text()

But I need a single XPath, because I have to preserve the order of the 'In' and 'Out' values (their order in the table).

Upvotes: 0

Views: 324

Answers (1)

MattH

Reputation: 38265

This will do it, though I have no idea how robust it will be for your "corpus":

//table[@class="tablesorter"]/tbody/tr/td[3]/descendant::text()[normalize-space(.)!=""]

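Update

On the HTML as posted, the expression above also returns SMTH and the second Out, because it takes every non-blank text node under the td. Restricting the parent to the td itself or to the span with class span2 leaves only the wanted values, in document order:

//table[@class="tablesorter"]/tbody/tr/td[3]/descendant::text()[normalize-space(.)!=""][parent::td or parent::span[@class="span2"]]

['Out', 'In', 'In']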
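For anyone who wants to check the two expressions, here is a minimal sketch in Python. It assumes the third-party lxml package is installed; the names PAGE, BASE, FILTERED and select are made up for the illustration, and the {some td info} cells are replaced with placeholder text.

```python
# Assumes the third-party lxml package is installed (pip install lxml).
from lxml import html

# The table from the question, with placeholder text in the first two cells.
PAGE = """
<html><body>
<table class="tablesorter"><tbody>
  <tr class="tr_class">
    <td>a</td><td>b</td>
    <td>
      <span class="span1">
        <span class="span2">Out</span>
        <span class="span3">SMTH</span>
        <span class="span4">Out</span>
      </span>
    </td>
  </tr>
  <tr class="tr_class"><td>a</td><td>b</td><td>In</td></tr>
  <tr class="tr_class"><td>a</td><td>b</td><td>In</td></tr>
</tbody></table>
</body></html>
"""

# Every non-blank text node under each third td.
BASE = ('//table[@class="tablesorter"]/tbody/tr/td[3]'
        '/descendant::text()[normalize-space(.)!=""]')

# Same, but only text whose parent is the td itself or the span2 span.
FILTERED = BASE + '[parent::td or parent::span[@class="span2"]]'


def select(expr, page=PAGE):
    """Run an XPath expression and return the matched text, stripped."""
    tree = html.fromstring(page)
    return [text.strip() for text in tree.xpath(expr)]


first = select(BASE)
updated = select(FILTERED)

print(first)    # ['Out', 'SMTH', 'Out', 'In', 'In']
print(updated)  # ['Out', 'In', 'In']
```

Because both results come from a single expression, the values come back in table order, which is what the question asks for.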

Upvotes: 1
