emapco

Reputation: 13

Html Agility Pack get contents from table

I need to get the location, address, and phone number of each gym listed at "http://anytimefitness.com/find-gym/list/AL". So far I have this:

    HtmlDocument htmlDoc = new HtmlDocument();

    htmlDoc.OptionFixNestedTags = true;
    htmlDoc.LoadHtml(stateURLs[0].ToString());

    var BlankNode = 
        htmlDoc.DocumentNode.SelectNodes("/div[@class='segmentwhite']/table[@style='width: 100%;']//tr[@class='']");

    var GrayNode = 
        htmlDoc.DocumentNode.SelectNodes("/div[@class='segmentwhite']/table[@style='width: 100%;']//tr[@class='gray_bk']");

I have looked around Stack Overflow for a while, but none of the existing posts about HtmlAgilityPack have really helped. I have also been using http://www.w3schools.com/xpath/xpath_syntax.asp as an XPath reference.

Upvotes: 0

Views: 786

Answers (2)

Garett

Reputation: 16818

Here's an example I tested in LINQPad.

    string url = @"http://anytimefitness.com/find-gym/list/AL";
    var client = new System.Net.WebClient();
    var data = client.DownloadData(url);
    var html = Encoding.UTF8.GetString(data);

    var htmlDoc = new HtmlAgilityPack.HtmlDocument();
    htmlDoc.OptionFixNestedTags = true;
    htmlDoc.LoadHtml(html);

    var gyms = htmlDoc.DocumentNode.SelectNodes("//tbody/tr[@class='' or @class='gray_bk']");
    foreach (var gym in gyms) {
        var city = gym.SelectSingleNode("./td[2]").InnerText;
        var address = gym.SelectSingleNode("./td[3]").InnerText;
        var phone = gym.SelectSingleNode("./td[4]").InnerText;
    }

Since HtmlAgilityPack also supports LINQ, you could also do something like:

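    // Requires: using System.Linq;
    string[] classes = { "", "gray_bk" };

    var gyms = htmlDoc
        .DocumentNode
        .Descendants("tr")
        // Rows without a class attribute are skipped: Attributes["class"] is null for them,
        // so reading .Value directly would throw a NullReferenceException.
        .Where(t => t.Attributes["class"] != null
                    && classes.Contains(t.Attributes["class"].Value))
        .ToList();

    gyms.ForEach(gym => {
        var city = gym.SelectSingleNode("./td[2]").InnerText;
        var address = gym.SelectSingleNode("./td[3]").InnerText;
        var phone = gym.SelectSingleNode("./td[4]").InnerText;
    });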
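
Both snippets above assume every matched row has at least four cells and that at least one row matches. As a hedged sketch of a more defensive variant: `SelectNodes` returns null rather than an empty collection when nothing matches, so it may be worth checking for that before looping. The class name `GymRowParser`, the method `ParseRows`, and the row layout (city, address, and phone in cells 2 to 4) are assumptions made for illustration, not something confirmed against the live page.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

// Hypothetical helper for illustration; not part of HtmlAgilityPack.
public static class GymRowParser
{
    // Returns one string[] { city, address, phone } per matching row.
    public static List<string[]> ParseRows(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var results = new List<string[]>();

        // SelectNodes returns null (not an empty list) when no node matches.
        var rows = doc.DocumentNode.SelectNodes("//tr[@class='' or @class='gray_bk']");
        if (rows == null)
        {
            return results;
        }

        foreach (var row in rows)
        {
            var city = row.SelectSingleNode("./td[2]");
            var address = row.SelectSingleNode("./td[3]");
            var phone = row.SelectSingleNode("./td[4]");

            // Skip rows that do not have the assumed four-cell layout.
            if (city == null || address == null || phone == null)
            {
                continue;
            }

            results.Add(new[] { Clean(city), Clean(address), Clean(phone) });
        }

        return results;
    }

    // Decodes HTML entities and trims surrounding whitespace.
    private static string Clean(HtmlNode cell)
    {
        return HtmlEntity.DeEntitize(cell.InnerText).Trim();
    }
}
```

Calling `GymRowParser.ParseRows(html)` with the downloaded page source would then give a list that is simply empty when the page layout differs from what is assumed here, instead of throwing.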

Upvotes: 0

har07

Reputation: 89285

Since the <div> you're after is not a direct child of the root node, you need to use // instead of /. You can then combine the XPath expressions for BlankNode and GrayNode using the or operator, for example:

    var htmlweb = new HtmlWeb();
    HtmlDocument htmlDoc = htmlweb.Load("http://anytimefitness.com/find-gym/list/AL");
    htmlDoc.OptionFixNestedTags = true;

    var AllNode =
        htmlDoc.DocumentNode.SelectNodes("//div[@class='segmentwhite']/table//tr[@class='' or @class='gray_bk']");
    foreach (HtmlNode node in AllNode)
    {
        var location = node.SelectSingleNode("./td[2]").InnerText;
        var address = node.SelectSingleNode("./td[3]").InnerText;
        var phone = node.SelectSingleNode("./td[4]").InnerText;

        // do something with the information above
    }
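
To see the difference between / and // in isolation, here is a minimal sketch that runs both expressions against a small inline document. The markup in `SampleHtml` is hypothetical and much simpler than the real page, and `XPathAxisDemo` and `CountMatches` are names invented for this illustration.

```csharp
using System;
using HtmlAgilityPack;

// Hypothetical demo class for illustration; not part of HtmlAgilityPack.
public static class XPathAxisDemo
{
    // Simplified stand-in for the real page; the actual layout is an assumption.
    public const string SampleHtml =
        "<html><body><div class='segmentwhite'><table>" +
        "<tr class='gray_bk'><td>1</td><td>Auburn</td><td>456 Oak Ave</td><td>555-0101</td></tr>" +
        "</table></div></body></html>";

    // Counts the nodes an XPath expression selects from the document root.
    public static int CountMatches(string xpath)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(SampleHtml);

        // SelectNodes returns null when nothing matches, so map that to 0.
        var nodes = doc.DocumentNode.SelectNodes(xpath);
        return nodes == null ? 0 : nodes.Count;
    }

    public static void Main()
    {
        // "/div" only looks at direct children of the root, where the only element is <html>.
        Console.WriteLine(CountMatches("/div[@class='segmentwhite']"));

        // "//div" searches all descendants, so it finds the nested <div>.
        Console.WriteLine(CountMatches("//div[@class='segmentwhite']"));
    }
}
```

The first expression finds nothing because the <div> sits below <html> and <body>, which is why the original `SelectNodes` calls in the question come back null.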

Upvotes: 1
