SriniShine

Reputation: 1139

Using HtmlAgilityPack to get the last <tr> of an HTML table

I have an HTML table structure. I need to get the value of the first <td> in the final <tr> tag. Here is my table structure. The value I need from the function getFinalNodeValue below is "3".

 <table id="test">
            <tr>
                <td>ID</td>
                <td>Name</td>
                <td>Age</td>
            </tr>
            <tr>
                <td>1</td>
                <td>Yasoda</td>
                <td>21</td>
            </tr>

            <tr>
                <td>2</td>
                <td>Samantha</td>
                <td>25</td>
            </tr>

            <tr>
                <td>3</td>
                <td>Sajee</td>
                <td>26</td>
            </tr>

        </table>


Here is the code I wrote using HtmlAgilityPack.

 public String getFinalNodeValue(String URL)
        {
            var webGet = new HtmlWeb();
            var pageSource = webGet.Load(URL);

            var table = pageSource.DocumentNode.SelectSingleNode("//table[@id='test']//tr[1]");


            string id = null;


            IEnumerable<HtmlNode> trNodes = table.DescendantsAndSelf();

            foreach (var currentItem in trNodes)
            {
                if (currentItem == trNodes.Last())
                {
                    IEnumerable<HtmlNode> tdNodes = currentItem.Descendants();

                    foreach (var x in tdNodes)
                    {
                        if (x == tdNodes.First())
                        {
                            id = x.InnerText;
                        }
                        else
                        {
                            break;
                        }
                    }

                }
                else
                {
                    continue;
                }
            }

            return id;

        }

The method doesn't return a value. Any help is highly appreciated.

Upvotes: 2

Views: 4394

Answers (3)

Simon Mourier

Reputation: 139177

This should do it:

    HtmlDocument doc = new HtmlDocument();
    doc.Load(MyHtmlFile);

    HtmlNode node = doc.DocumentNode.SelectSingleNode("//table[@id='test']/tr[last()]/td");
    Console.WriteLine(node.InnerText);

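Note the use of the XPath last() function, which selects the last <tr> child of the table. SelectSingleNode then returns the first matching <td> in that row.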
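For reference, here is a minimal self-contained sketch of the same idea. It assumes the HtmlAgilityPack NuGet package is referenced and that the markup is held in a string rather than loaded from a file. The class name LastRowDemo, the shortened sample table, and the null check are illustrative additions, not part of the snippet above. The td[1] step only makes the choice of the first cell explicit.

```csharp
using System;
using HtmlAgilityPack;

class LastRowDemo
{
    static void Main()
    {
        // Sample markup mirroring the table in the question (assumed input).
        const string html = @"<table id=""test"">
  <tr><td>ID</td><td>Name</td><td>Age</td></tr>
  <tr><td>1</td><td>Yasoda</td><td>21</td></tr>
  <tr><td>3</td><td>Sajee</td><td>26</td></tr>
</table>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // tr[last()] picks the last tr child of the table; td[1] picks its first cell.
        HtmlNode cell = doc.DocumentNode.SelectSingleNode(
            "//table[@id='test']/tr[last()]/td[1]");

        // SelectSingleNode returns null when nothing matches, so guard before use.
        Console.WriteLine(cell != null ? cell.InnerText.Trim() : "(no match)");
    }
}
```

If the real page wraps its rows in a <tbody> element, the path would likely need to include that step (for example /tbody/tr[last()]), since /tr only matches direct children of the table.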

Upvotes: 5

jason

Reputation: 3615

If you change your table declaration like this:

<table id="test" runat="server">

You can iterate over it in the codebehind, like this:

HtmlTable myTable = this.test;
int rowCount = myTable.Rows.Count;
// C# uses square-bracket indexers; Rows(...) and Cells(...) are VB syntax.
HtmlTableCell td = myTable.Rows[rowCount - 1].Cells[0];
string val = td.InnerText;

Upvotes: 0

Oded

Reputation: 499212

The XPath you use to populate the table variable, "//table[@id='test']//tr[1]", selects the first TR element (XPath positions are 1-based), not the table.

This most likely should just be "//table[@id='test']".

At this point, to fetch the descendant TR nodes into the trNodes variable, you should use:

IEnumerable<HtmlNode> trNodes = table.SelectNodes("tr");
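Putting these two changes together, the method from the question could be sketched as follows. This is only one possible rewrite, assuming the HtmlAgilityPack package is referenced. The TableReader class name, the null checks, and the Trim() call are illustrative additions rather than anything from the original code.

```csharp
using System.Linq;
using HtmlAgilityPack;

public static class TableReader
{
    // Hypothetical rewrite of getFinalNodeValue; returns null when the
    // table, its rows, or the first cell cannot be found.
    public static string GetFinalNodeValue(string url)
    {
        var webGet = new HtmlWeb();
        HtmlDocument pageSource = webGet.Load(url);

        // Select the table itself, not one of its rows.
        HtmlNode table = pageSource.DocumentNode.SelectSingleNode("//table[@id='test']");
        if (table == null) return null;

        // Direct tr children only; SelectNodes returns null when none match.
        HtmlNodeCollection trNodes = table.SelectNodes("tr");
        if (trNodes == null) return null;

        // First td child of the last row.
        HtmlNode firstCell = trNodes.Last().SelectSingleNode("td");
        return firstCell?.InnerText.Trim();
    }
}
```

Selecting the td directly avoids walking Descendants(), which also yields whitespace text nodes and is a likely reason the original loop never assigned a value.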

Upvotes: 1
