Reputation: 6919
I have an HTML file with two tables, and I am using HtmlAgilityPack.HtmlDocument to retrieve the data.
I tried using
htmldoc.DocumentNode.SelectNodes("//table[2]/tr")
to access the rows of the second table, but I get null. If I do
htmldoc.DocumentNode.SelectNodes("//table[1]/tr")
I get the rows of the first table just fine.
I know the parser does see a second table, because if I try
htmldoc.DocumentNode.SelectNodes("//table")
I get a count of 2.
But if I do:
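// Query once and reuse the result instead of re-running the XPath three times.
var tables = htmldoc.DocumentNode.SelectNodes("//table");
if (tables != null && tables.Count == 2)
{
    // The collection is zero-based, so index 1 is the second table in document order.
    var table = tables[1];
    foreach (HtmlNode row in table.SelectNodes(".//tr"))
    {
        // process each row of the second table here
    }
}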
Then I get the rows of the second table.
My question is: why can't I get the rows of the second table with a single XPath expression?
htmldoc.DocumentNode.SelectNodes("//table[2]/tr")
Upvotes: 0
Views: 49
Reputation: 89325
I suspect that's because each table resides in a different parent element. In that case, //table[2] matches every table element that is the 2nd table child of its own parent, not the 2nd table in the document. For example:
<root>
  <parent>
    <table>ignored</table>
    <table>this will be selected</table>
  </parent>
  <parent>
    <table>ignored</table>
    <table>this will be selected</table>
  </parent>
</root>
To select the 2nd table in the whole document, wrap the table selector in parentheses before applying the index, so that the index applies to the full list of matched tables in document order:
(//table)[2]/tr
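
The difference can be checked with a small, self-contained sketch. It uses the XPath 1.0 engine in System.Xml rather than HtmlAgilityPack so that it runs without extra packages; the markup, the class name XPathIndexDemo and the helper CountMatches are hypothetical, and the assumption is that the tables in the question each sit in their own parent element. Note that System.Xml returns an empty list where HtmlAgilityPack's SelectNodes returns null, as seen in the question.

```csharp
using System;
using System.Xml;

public static class XPathIndexDemo
{
    // Hypothetical markup: each table sits in its own <div>, so neither
    // table is the second <table> child of its parent.
    public const string Markup =
        "<html><body>" +
        "<div><table><tr><td>A1</td></tr></table></div>" +
        "<div><table><tr><td>B1</td></tr><tr><td>B2</td></tr></table></div>" +
        "</body></html>";

    // Returns how many nodes the given XPath expression selects in Markup.
    public static int CountMatches(string xpath)
    {
        var doc = new XmlDocument();
        doc.LoadXml(Markup);
        return doc.SelectNodes(xpath).Count;
    }

    public static void Main()
    {
        // Two tables exist in the document.
        Console.WriteLine(CountMatches("//table"));
        // [1] is evaluated per parent: both tables are "first in their parent",
        // so the rows of both tables are selected.
        Console.WriteLine(CountMatches("//table[1]/tr"));
        // No parent has a second table child, so nothing matches.
        Console.WriteLine(CountMatches("//table[2]/tr"));
        // Parentheses collect all tables first, then [2] picks the second one.
        Console.WriteLine(CountMatches("(//table)[2]/tr"));
    }
}
```

With this markup, only the parenthesized form reaches the rows of the second table; the unparenthesized //table[2]/tr selects nothing, which would correspond to the null result reported in the question.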
Upvotes: 1