Reputation: 17
I am running the following code to extract all the links on a page using HtmlAgilityPack. When I enter the URL https://htmlagilitypack.codeplex.com/ the code works fine, and the URLs are extracted and displayed correctly. But if I enter another URL, such as https://htmlagilitypack.codeplex.com/discussions/12447, I get the error "Object reference not set to an instance of an object". The error occurs on this line:
OutputLabel.Text += counter + ". " + aTag.InnerHtml + " - " +
aTag.Attributes["href"].Value + "\t" + "<br />";
Please help me out. It may be a minor mistake, but please don't downvote the question.
var getHtmlWeb = new HtmlWeb();
var document = getHtmlWeb.Load(InputTextBox.Text);
var aTags = document.DocumentNode.SelectNodes("//a");
int counter = 1;
if (aTags != null)
{
    foreach (var aTag in aTags)
    {
        OutputLabel.Text += counter + ". " + aTag.InnerHtml + " - " +
            aTag.Attributes["href"].Value + "\t" + "<br />";
        counter++;
    }
}
Upvotes: 1
Views: 489
Reputation: 236268
It looks like some of the anchors do not have an href attribute. For example, the given page contains this anchor:
<a name="post40566"></a>
So aTag.Attributes["href"] returns null, and you get an exception when you try to read the Value of that missing attribute. You can change the XPath to select only those anchors that have this attribute:
document.DocumentNode.SelectNodes("//a[@href]");
Or verify that the attribute exists before accessing its value:
if (aTag.Attributes["href"] != null)
// ...
A third option is to use the GetAttributeValue method and provide a default value that is displayed when the attribute is missing:
aTag.GetAttributeValue("href", "N/A")
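For reference, here is a minimal console sketch that puts the first and third options side by side. It is an illustration rather than a drop-in fix: the inline HTML string and the LinkLister class name are assumptions made up for the example, and Console.WriteLine stands in for the OutputLabel from the question. It assumes the HtmlAgilityPack package is referenced.

```csharp
using System;
using HtmlAgilityPack;

class LinkLister // hypothetical class name, for illustration only
{
    static void Main()
    {
        // Assumed sample markup: one anchor with href, one without
        // (the second mirrors the <a name="post40566"></a> case).
        var html = "<a href=\"/home\">Home</a><a name=\"post40566\"></a>";
        var document = new HtmlDocument();
        document.LoadHtml(html);

        // Option 1: the XPath predicate [@href] skips anchors that lack href,
        // so Attributes["href"] is never null inside this loop.
        var withHref = document.DocumentNode.SelectNodes("//a[@href]");
        if (withHref != null) // SelectNodes returns null when nothing matches
        {
            foreach (var aTag in withHref)
            {
                Console.WriteLine(aTag.InnerHtml + " - " + aTag.Attributes["href"].Value);
            }
        }

        // Option 3: select every anchor and fall back to "N/A"
        // when the href attribute is missing.
        var allAnchors = document.DocumentNode.SelectNodes("//a");
        if (allAnchors != null)
        {
            foreach (var aTag in allAnchors)
            {
                Console.WriteLine(aTag.InnerHtml + " - " + aTag.GetAttributeValue("href", "N/A"));
            }
        }
    }
}
```

Which option fits depends on whether anchors without href should appear in the output at all: the XPath filter drops them, while GetAttributeValue keeps them with a placeholder.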
Upvotes: 4