Reputation: 113
I have this piece of code:
string x = textBox1.Text;
string[] list = x.Split(';');
foreach (string u in list)
{
    string url = "http://*********/index.php?n=" + u;
    webBrowser1.Navigate(url);
    webBrowser1.Document.GetElementsByTagName("META");
}
and I'm trying to get the <META> tags to output to a message box, but when I test it out, I keep getting this error:
Object reference not set to an instance of an object.
Upvotes: 0
Views: 9487
Reputation: 1739
You can retrieve META tags and any other HTML element directly from your WebBrowser control; there is no need for the HTML Agility Pack or any other component.
Like Mark said, wait first for the DocumentCompleted event:
webBrowser.DocumentCompleted += WebBrowser_DocumentCompleted;
Then you can read any element and its content from the HTML document. The following code gets the title and the meta description:
private void WebBrowser_DocumentCompleted(object sender, System.Windows.Forms.WebBrowserDocumentCompletedEventArgs e)
{
    // The sender is the WebBrowser that raised the event.
    System.Windows.Forms.WebBrowser browser = (System.Windows.Forms.WebBrowser)sender;
    string title = browser.Document.Title;
    string description = String.Empty;
    foreach (HtmlElement meta in browser.Document.GetElementsByTagName("META"))
    {
        // Compare the name attribute case-insensitively.
        if (String.Equals(meta.Name, "description", StringComparison.OrdinalIgnoreCase))
        {
            description = meta.GetAttribute("content");
        }
    }
}
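The question loops over several URLs, and calling Navigate again inside the loop would cancel the previous navigation before it completes. One way to handle that is to queue the URLs and start the next navigation from the DocumentCompleted handler. This is only a rough sketch: the form name MetaReaderForm, the Start and NavigateNext helpers, and the baseUrl parameter are made up for illustration, and escaping each value with Uri.EscapeDataString is my assumption about what the site expects.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Windows.Forms;

// Hypothetical form; all names here are illustrative, not from the question.
public class MetaReaderForm : Form
{
    private readonly WebBrowser webBrowser1 = new WebBrowser();
    private readonly Queue<string> pending = new Queue<string>();

    public MetaReaderForm()
    {
        Controls.Add(webBrowser1);
        webBrowser1.DocumentCompleted += OnDocumentCompleted;
    }

    // Queue one URL per ';'-separated value, then start the first navigation.
    public void Start(string baseUrl, string input)
    {
        foreach (string u in input.Split(new[] { ';' }, StringSplitOptions.RemoveEmptyEntries))
        {
            // Assumption: each value should be URL-escaped before it is appended.
            pending.Enqueue(baseUrl + Uri.EscapeDataString(u.Trim()));
        }
        NavigateNext();
    }

    private void NavigateNext()
    {
        if (pending.Count > 0)
        {
            webBrowser1.Navigate(pending.Dequeue());
        }
    }

    private void OnDocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
    {
        // DocumentCompleted can also fire for frames; only handle the top-level page.
        if (!e.Url.Equals(webBrowser1.Url))
        {
            return;
        }

        // Collect the name and content of every <meta> tag on the loaded page.
        StringBuilder sb = new StringBuilder();
        foreach (HtmlElement meta in webBrowser1.Document.GetElementsByTagName("META"))
        {
            sb.AppendLine(meta.Name + ": " + meta.GetAttribute("content"));
        }
        MessageBox.Show(sb.ToString(), e.Url.ToString());

        // Only now is it safe to move on to the next URL.
        NavigateNext();
    }
}
```

You would call Start with the base URL from the question and textBox1.Text, for example from a button click handler. The frame check is a common idiom, but it may need adjusting for pages that redirect in unusual ways.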
Upvotes: 0
Reputation: 225272
Your problem is that you're accessing the Document object before the document has loaded. WebBrowser navigation is asynchronous, so Navigate returns before the page is available. Just parse the HTML using a library like the HTML Agility Pack.
Here's how you might get the <meta> tags using the HTML Agility Pack. (Assumes using System.Net; and using HtmlAgilityPack;.)
// Create a WebClient to use to download the string:
using (WebClient wc = new WebClient()) {
    // Create a document object:
    HtmlDocument d = new HtmlDocument();
    // Download the content and parse the HTML:
    d.LoadHtml(wc.DownloadString("http://stackoverflow.com/questions/10368605/getelementsbytagname-in-c-sharp/10368631#10368631"));
    // Loop through all the <meta> tags:
    foreach (HtmlNode metaTag in d.DocumentNode.Descendants("meta")) {
        // It's a <meta> tag! Do something with it.
    }
}
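As for the "do something with it" part, here is a minimal sketch of reading each tag's name and content attributes with GetAttributeValue, which returns the supplied default when the attribute is missing. The sample markup and the MetaTagDemo class are invented for illustration so the example runs without a download; it assumes the HtmlAgilityPack package is referenced.

```csharp
using System;
using HtmlAgilityPack;

// Hypothetical demo class; the markup below is made-up sample input.
class MetaTagDemo
{
    static void Main()
    {
        string html = "<html><head><title>Demo</title>"
                    + "<meta name=\"description\" content=\"A sample page\">"
                    + "<meta name=\"keywords\" content=\"c#, html\">"
                    + "</head><body></body></html>";

        // Parse the markup from a string instead of downloading it.
        HtmlDocument d = new HtmlDocument();
        d.LoadHtml(html);

        foreach (HtmlNode metaTag in d.DocumentNode.Descendants("meta"))
        {
            // Fall back to an empty string when an attribute is absent.
            string name = metaTag.GetAttributeValue("name", string.Empty);
            string content = metaTag.GetAttributeValue("content", string.Empty);
            Console.WriteLine(name + " = " + content);
        }
    }
}
```

In the question's scenario you could append each line to a string and pass it to MessageBox.Show instead of writing to the console.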
Upvotes: 3
Reputation: 839144
You shouldn't try to access the document until it has finished loading. Run that code inside a handler for the DocumentCompleted event.
But Matti is right. If all you need is to read the HTML, you shouldn't be using a WebBrowser. Just fetch the text and parse it using an HTML parser.
Upvotes: 2