Coding Duchess

Reputation: 6919

Strange occurrence with XPath expression when using HtmlAgilityPack

I have an html file with two tables and I am using HtmlAgilityPack.HtmlDocument to retrieve the data.

I tried using

htmldoc.DocumentNode.SelectNodes("//table[2]/tr")

to access the rows of the second table, but I get null. If I do

htmldoc.DocumentNode.SelectNodes("//table[1]/tr")

I get the rows of the first table just fine.

I know it does see a second table, because if I try

htmldoc.DocumentNode.SelectNodes("//table")

I get a count of 2.

But if I do:

var tables = htmldoc.DocumentNode.SelectNodes("//table");
if (tables != null && tables.Count == 2)
{
    // The collection is zero-based, so index 1 is the second table.
    var table = tables[1];
    foreach (HtmlNode row in table.SelectNodes(".//tr"))
    {

    }
}

Then I get the rows of the second table.

My question is: why can't I get the rows of the second table with a single XPath expression?

htmldoc.DocumentNode.SelectNodes("//table[2]/tr")

Upvotes: 0

Views: 49

Answers (1)

har07

Reputation: 89325

I suspect that's because each table resides in a different parent element. //table[2] is shorthand for /descendant-or-self::node()/table[2], so the [2] predicate is evaluated relative to each parent: it matches every table element that is the 2nd table child of its own parent. For example:

<root>
    <parent>
        <table>ignored</table>
        <table>this will be selected</table>
    </parent>
    <parent>
        <table>ignored</table>
        <table>this will be selected</table>
    </parent>
</root>

To select the 2nd table in the whole document, you need to wrap the table selector in parentheses before applying the index, so that the index is applied to the full set of tables in document order:

(//table)[2]/tr

xpathtester.com demo
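
Here is a minimal sketch of the difference in HtmlAgilityPack. The markup, the XPathDemo class, and the CountRows helper are hypothetical and made up for illustration, since I can't see your actual html; it assumes the HtmlAgilityPack NuGet package is referenced and that, as I suspect, each table sits in its own parent element.

```csharp
using System;
using HtmlAgilityPack;

public static class XPathDemo
{
    // Hypothetical markup: each <table> is the only table inside its own <div>.
    public const string SampleHtml = @"<html><body>
<div><table><tr><td>A1</td></tr><tr><td>A2</td></tr></table></div>
<div><table><tr><td>B1</td></tr></table></div>
</body></html>";

    // Counts the nodes matched by the XPath expression.
    // SelectNodes returns null (not an empty collection) when nothing matches.
    public static int CountRows(string html, string xpath)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes(xpath);
        return nodes?.Count ?? 0;
    }

    public static void Main()
    {
        // [2] is evaluated per parent: neither <div> has a second <table> child.
        Console.WriteLine(CountRows(SampleHtml, "//table[2]/tr"));

        // [1] is also per parent: both tables are the first table in their <div>,
        // so the rows of both tables are matched.
        Console.WriteLine(CountRows(SampleHtml, "//table[1]/tr"));

        // Parentheses gather every table in document order first,
        // then [2] picks the second one in the whole document.
        Console.WriteLine(CountRows(SampleHtml, "(//table)[2]/tr"));
    }
}
```

Note that with markup like this, //table[1]/tr would return the rows of every table that is first within its parent, not only the rows of the first table in the document, so (//table)[1]/tr is the safer form there as well.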

Upvotes: 1
