StatsViaCsh

Reputation: 2640

Html Agility Pack / XPath: select child node by [index] help, please?

I've been working for a while with a node set in C# / Html Agility Pack, and through trial and error I have a list of nodes that I want to loop over, getting the child nodes of each node in the loop. I'd like to refer to the children by index # (it seems to be the easiest way, yet here I post). I've tried different ways to format the XPath, including "[0]", "/[0]", "tr/[0]", etc. Here's what I have so far; everything works fine up to the first commented line:

protected override List<IDataPoint> ReturnDataPointsFromIndividualAddressString(string AddressString)
            {
                List<IDataPoint> earningsAnnouncements = new List<IDataPoint>();

                HtmlWeb hwObject = new HtmlWeb();
                HtmlDocument htmlDoc = hwObject.Load(AddressString);

                if (htmlDoc.DocumentNode != null)
                {
                    List<HtmlNode> nodeList = new List<HtmlNode>();

                    var nodes = htmlDoc.DocumentNode.SelectNodes("html[1]/body[1]/table[4]/tr[1]/td[1]/table[1]/tr");

                    if (nodes != null)
                    {
                        foreach (HtmlNode n in nodes)
                        {
                            if (n.OuterHtml.Contains("finance.yahoo.com"))
                                    nodeList.Add(n);
                        }
                    }

                    foreach (HtmlNode node in nodeList)
                    {
                        EarningsAnnouncementDP earningsAnnouncement = new EarningsAnnouncementDP();

                        //Error: Expression must evaluate to a node set.
                        earningsAnnouncement.Company = (node.SelectSingleNode("[0]")).InnerText.ToString();
                        earningsAnnouncement.Ticker = node.SelectSingleNode("[1]").InnerText.ToString();
                        earningsAnnouncement.Estimate = node.SelectSingleNode("[2]").InnerText.ToString();
                        earningsAnnouncement.AnnouncementTime = node.SelectSingleNode("[3]").InnerText.ToString();

                        earningsAnnouncements.Add(earningsAnnouncement);
                    }

                }

                return earningsAnnouncements;
            }

Upvotes: 1

Views: 12427

Answers (1)

Anil Vangari

Reputation: 570

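Your XPath has already taken you down to the tr nodes, so each node in the loop is a row. From there you need to select its td children, with an XPath relative to the row. Either of the following equivalent forms works: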

node.SelectSingleNode("./td[1]").InnerText;
node.SelectSingleNode("td[1]").InnerText;

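Also, XPath positions are 1-based, so the first td node is td[1], not td[0]. A bare predicate such as "[0]" is not a valid XPath expression on its own, because a predicate has to follow a step like td; that is why you get "Expression must evaluate to a node set."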
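To make the relative, 1-based indexing concrete, here is a minimal sketch. The class and method names (RowCellsSketch, FirstRow, CellText) are made up for illustration and are not part of Html Agility Pack; the sample markup is invented too. It only uses LoadHtml, SelectSingleNode and InnerText, and it assumes the row's cells are direct td children.

```csharp
using System;
using HtmlAgilityPack;

public static class RowCellsSketch
{
    // Hypothetical helper: parse a fragment and return its first <tr>, or null.
    public static HtmlNode FirstRow(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        return doc.DocumentNode.SelectSingleNode("//tr");
    }

    // Hypothetical helper: text of the n-th <td> child of a row.
    // n is 1-based, exactly like the XPath position it is turned into.
    public static string CellText(HtmlNode row, int n)
    {
        // "td[n]" is relative to the row; "./td[n]" would be equivalent.
        HtmlNode cell = row.SelectSingleNode("td[" + n + "]");

        // SelectSingleNode returns null when nothing matches, so guard
        // before touching InnerText.
        return cell == null ? null : cell.InnerText.Trim();
    }
}
```

With a row like <tr><td>Acme Corp</td><td>ACME</td><td>0.42</td><td>After Market Close</td></tr>, the four fields in the question would map to CellText(row, 1) through CellText(row, 4). Checking for null matters here because a row with fewer cells would otherwise throw a NullReferenceException on InnerText.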

As Alex pointed out, you can also skip XPath and index the row's child collection directly, which is an excellent suggestion. Unlike XPath positions, this index is zero-based:

node.ChildNodes[0].InnerText
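One caveat with the indexer, sketched below: ChildNodes holds every child of the row, including the whitespace text nodes that sit between cells when the markup is indented, so ChildNodes[0] is not necessarily the first td. Filtering to td elements first keeps the zero-based index aligned with the cells. The names ChildNodesSketch and CellTexts are hypothetical, and the sketch assumes the fragment contains at least one tr.

```csharp
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

public static class ChildNodesSketch
{
    // Hypothetical helper: trimmed text of each <td> child of the first <tr>.
    public static List<string> CellTexts(string tableHtml)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(tableHtml);

        HtmlNode row = doc.DocumentNode.SelectSingleNode("//tr");
        if (row == null)
        {
            return new List<string>();
        }

        // Elements("td") yields only child elements named td, so any
        // text nodes between the cells are skipped.
        return row.Elements("td")
                  .Select(td => td.InnerText.Trim())
                  .ToList();
    }
}
```

The returned list is zero-based, so the company name from the question would be at index 0 and the announcement time at index 3.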

HTH

Upvotes: 5
