user1839169

Reputation: 207

Get the content of an element of a Web page using C#

Is there any way to get the content of an element or control of a web page that is open in a browser from a C# app?

I tried to get the window ex, but I don't know how to use it afterwards to communicate with the page. I also tried this code:

using (var client = new WebClient())
{
    var contents = client.DownloadString("http://www.google.com");
    Console.WriteLine(contents);
}

This code returns the page's entire raw HTML, which I can't use as is.

Upvotes: 4

Views: 7897

Answers (1)

Darin Dimitrov

Reputation: 1038710

You could use an HTML parser such as HTML Agility Pack to extract the information you are interested in from the HTML you downloaded:

using (var client = new WebClient())
{
    // Download the HTML
    string html = client.DownloadString("http://www.google.com");

    // Now feed it to HTML Agility Pack:
    HtmlDocument doc = new HtmlDocument();
    doc.LoadHtml(html);

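    // Now you could query the DOM. For example you could extract
    // all href attributes from all anchors. Note that SelectNodes
    // returns null (not an empty collection) when nothing matches:
    HtmlNodeCollection links = doc.DocumentNode.SelectNodes("//a[@href]");
    if (links != null)
    {
        foreach (HtmlNode link in links)
        {
            HtmlAttribute href = link.Attributes["href"];
            if (href != null)
            {
                Console.WriteLine(href.Value);
            }
        }
    }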
}
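
Since the question asks for the content of one particular element, here is a minimal sketch of that case, assuming the element has a known id attribute. The class name ElementReader, the method GetInnerText, and the sample id are hypothetical names chosen for illustration; only HtmlDocument, LoadHtml, GetElementbyId and InnerText come from HTML Agility Pack (installed from the HtmlAgilityPack NuGet package).

```csharp
using System;
using HtmlAgilityPack; // NuGet package: HtmlAgilityPack

public static class ElementReader
{
    // Hypothetical helper: returns the trimmed text of the element with
    // the given id, or null when the page contains no such element.
    public static string GetInnerText(string html, string id)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // GetElementbyId (lowercase "b") returns null if the id is absent.
        HtmlNode node = doc.GetElementbyId(id);
        return node == null ? null : node.InnerText.Trim();
    }
}
```

You could pass it the string returned by DownloadString, for example ElementReader.GetInnerText(html, "someId"), where "someId" stands for whatever id the target element has on the page. Keep in mind that this only sees the HTML the server sent; content that the browser adds later through JavaScript will not be there.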

Upvotes: 6
