Siraj S.

Reputation: 3751

Parse selected table rows in Python with lxml and XPath

Below is the structure of the HTML file that I wish to parse:

<tr data-mod-primary="true">
    <td>'some text'
<tr>
    <td>'some text'
<tr>
    <td>'some text'
<tr data-mod-primary="true">
    <td>'some text'

I am interested in parsing only the text under <tr data-mod-primary="true"> and ignoring the other <tr> elements.

I can get the text of every <tr> with .xpath('//tr/td/text()'), but that is not what I want. After researching a solution for some time, I tried the code below:

.xpath('//tr[contains(@data-mod-primary="true",None)]/td/text()')

but this also returns the text under every <tr>, basically the same result as .xpath('//tr/td/text()').

Any help is appreciated. Thank you.

Upvotes: 0

Views: 225

Answers (1)

akuiper

Reputation: 215127

You can use an [@attr='value'] predicate to select specific tr elements:

//tr[@data-mod-primary='true']/td/text()

Or, if you use contains, it would be something like this (note that contains does a substring match, so the exact comparison above is stricter):

//tr[contains(@data-mod-primary, 'true')]/td/text()
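As a minimal sketch of how both expressions behave with lxml: the sample markup below is an assumption that mirrors the structure in the question (with a wrapping table and closing tags added, and placeholder cell text), and the variable names are only illustrative.

```python
from lxml import html

# Hypothetical sample mirroring the question's structure; the cell text
# and the wrapping <table> are made up for illustration.
SAMPLE = """
<table>
  <tr data-mod-primary="true"><td>first primary</td></tr>
  <tr><td>secondary one</td></tr>
  <tr><td>secondary two</td></tr>
  <tr data-mod-primary="true"><td>second primary</td></tr>
</table>
"""

tree = html.fromstring(SAMPLE)

# Exact match on the attribute value.
exact = tree.xpath("//tr[@data-mod-primary='true']/td/text()")

# Substring match; this would also accept a value such as "untrue".
partial = tree.xpath("//tr[contains(@data-mod-primary, 'true')]/td/text()")

print(exact)
print(partial)
```

With this sample, both lists should hold only the text of the two rows carrying the attribute. As for why the attempt in the question matched every row: in XPath, None is not a null value but is read as a child element name, which selects nothing and converts to an empty string, and contains(..., '') is true for any first argument.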

Upvotes: 1
