HtmlAgilityPack cannot find node

Question

I am trying to get the span called "Start" from here.

Chrome gives me this XPath: //*[@id="guide-pages"]/div[2]/div[1]/div/div[1]/div/div/div[2]/div/div[3]/div[2]/div[1]/h2

But HtmlAgilityPack returns null for it. I tried removing the trailing steps one by one: //*[@id="guide-pages"]/div[2]/div[1] works, but nothing longer than that does.

My full code:

    HtmlDocument doc = new HtmlDocument();
    var text = await ReadUrl();
    doc.LoadHtml(text);
    Console.WriteLine($"Getting Data From: {doc.DocumentNode.SelectSingleNode("//head/title").InnerText}"); // Works fine
    Console.WriteLine(doc.DocumentNode.SelectSingleNode("//*[@id='guide-pages']/div[2]/div[1]/div/div[1]/div/div/div[2]/div/div[3]/div[2]/div[1]/h2") == null);

Output:

    Getting Data From: Miss Fortune Build Guide : [7.11] KOREAN MF Build - Destroy the Carry! [Added Support] :: League of Legends Strategy Builds
    True

Hung Cao · Accepted Answer

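Don't use the XPath from Chrome. Chrome derives it from the live DOM, which can differ from the raw HTML that HtmlAgilityPack parses (for example, after scripts have modified the page), so a long positional path breaks easily. Use LINQ with HtmlAgilityPack instead. For example, .Descendants("div") returns every div beneath a given HTML node. Each node exposes metadata such as its id and other attributes (class and so on), and you can query for the div you want from there. Here is a handy extension method that checks whether an HtmlNode has all of the given classes.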

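    using System.Linq;
    using HtmlAgilityPack;

    public static class HtmlNodeExtensions
    {
        // Returns true only if the node's class attribute contains every given class name.
        public static bool HasClass(this HtmlNode node, params string[] classValueArray)
        {
            var classValues = node.GetAttributeValue("class", "").Split(' ');
            return classValueArray.All(c => classValues.Contains(c));
        }
    }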
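
As a rough sketch of how the helper might be combined with .Descendants, the example below looks a heading up by class instead of by position. The markup, the class names ("section", "start"), and the Demo/FindHeading names are all made up for illustration; the real page's structure would need to be inspected in the raw HTML (not the browser's element inspector) to pick the right classes. The helper is repeated so the snippet compiles on its own.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack;

public static class HtmlNodeExtensions
{
    // Helper from the answer: true if the node carries every listed class.
    public static bool HasClass(this HtmlNode node, params string[] classValueArray)
    {
        var classValues = node.GetAttributeValue("class", "").Split(' ');
        return classValueArray.All(c => classValues.Contains(c));
    }
}

public static class Demo
{
    // Hypothetical markup; the real page's ids and classes will differ.
    public const string SampleHtml = @"
<div id='guide-pages'>
  <div class='section intro'><h2><span>Overview</span></h2></div>
  <div class='section items start'><h2><span>Start</span></h2></div>
</div>";

    // Returns the text of the first h2 inside the first div that has
    // all of the requested classes, or null if nothing matches.
    public static string FindHeading(string html, params string[] classes)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Search by class rather than by position in the tree.
        var section = doc.DocumentNode
            .Descendants("div")
            .FirstOrDefault(d => d.HasClass(classes));

        return section?.Descendants("h2").FirstOrDefault()?.InnerText.Trim();
    }

    public static void Main()
    {
        Console.WriteLine(FindHeading(SampleHtml, "section", "start"));
    }
}
```

Because the lookup depends on class names rather than on a chain of div[n] steps, it keeps working if wrapper elements are added or removed around the target, which is where a copied positional XPath tends to fail.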
