Can't find node using HTMLAgilityPack

Question

I used the code sample from the following video: https://youtu.be/8e3Wklc1H_A

The code looks like this:

var webGet = new HtmlWeb();
var doc = webGet.Load("http://pastebin.com/raw.php?i=gF0DG08s");

HtmlNode OurNone = doc.DocumentNode.SelectSingleNode("//div[@id='footertext']");

if (OurNone != null)
    richTextBox1.Text = OurNone.InnerHtml;
else
    richTextBox1.Text = "nothing found";

At first I thought the original website (www.fuchsonline.com) might already be down, so I quickly made an HTML file containing only a footer and pasted it on Pastebin (link in the code above):

Copyright © FUCHS Online Ltd, 2013. All Rights Reserved.

When I use the Pastebin link in the code, the program always writes "nothing found" into the richTextBox. However, the website used in the video is still up, so I tried that site in webGet.Load instead and, voilà, it works.

Now I'd like to ask what exactly is wrong in my case. Is the HTML missing something, or does the program only work with complete websites? If so, what makes a website complete?

inspiredcoder · Accepted Answer

In this instance the Pastebin link simply serves your raw HTML as a plain string, which is why the lookup comes back empty. If you really want to parse this with HTML Agility Pack, you can first download the page, grab the raw HTML as a string, and then parse that string into the agility pack's document model.

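        // Requires: using System.IO; using System.Net; using HtmlAgilityPack;
        WebRequest webRequest = WebRequest.Create("http://pastebin.com/raw.php?i=gF0DG08s");
        webRequest.Method = "GET";
        string pageSource;
        // Download the response body as a plain string; dispose the response and reader when done.
        using (WebResponse response = webRequest.GetResponse())
        using (StreamReader reader = new StreamReader(response.GetResponseStream()))
        {
            pageSource = reader.ReadToEnd();
        }

        // Parse the downloaded string into the agility pack's document model.
        // Fully qualified to avoid a clash with System.Windows.Forms.HtmlDocument.
        var html = new HtmlAgilityPack.HtmlDocument();
        html.LoadHtml(pageSource);
        HtmlNode OurNone = html.DocumentNode.SelectSingleNode("//div[@id='footertext']");
        if (OurNone != null)
        {
            richTextBox1.Text = OurNone.InnerHtml;
        }
        else
        {
            richTextBox1.Text = "nothing found";
        }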
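
On the "complete website" part of the question: as far as I know, LoadHtml does not need a full page, and a bare fragment is enough for the XPath lookup. Here is a minimal sketch you can run as a console app to check this without any network access. The fragment string is my own stand-in, since I am only assuming what the pasted markup looked like (a div with id footertext wrapping the copyright line), and the class name FragmentDemo is made up for the example.

```csharp
using System;
using HtmlAgilityPack;

class FragmentDemo
{
    static void Main()
    {
        // Hypothetical stand-in for the pasted markup: a lone div,
        // with no <html> or <body> wrapper around it.
        string fragment =
            "<div id='footertext'><p>Copyright © FUCHS Online Ltd, 2013. All Rights Reserved.</p></div>";

        // Parse the string directly; nothing is downloaded here.
        var doc = new HtmlDocument();
        doc.LoadHtml(fragment);

        // Same XPath query as in the question.
        HtmlNode node = doc.DocumentNode.SelectSingleNode("//div[@id='footertext']");

        // Report the div's inner markup if it was found, otherwise the fallback text.
        Console.WriteLine(node != null ? node.InnerHtml : "nothing found");
    }
}
```

If this finds the div, the markup itself is fine and the difference lies in how the Pastebin response is loaded, not in the HTML being incomplete.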
