Markus

Reputation: 139

How to download a web page using C#

How to download a web page using C#?

Upvotes: 4

Views: 2715

Answers (5)

ValidfroM

Reputation: 2837

Use the WebClient class, and set a User-Agent request header if the site blocks page spiders.

using System;
using System.Net;
using System.IO;

public class Test
{
    public static void Main (string[] args)
    {
        if (args == null || args.Length == 0)
        {
            throw new ApplicationException ("Specify the URI of the resource to retrieve.");
        }
        using (WebClient client = new WebClient ())
        {
            // Send a browser-like User-Agent header; some sites reject
            // requests that do not carry one.
            client.Headers.Add ("user-agent", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; .NET CLR 1.0.3705;)");

            // The using blocks dispose the stream and reader even if reading throws.
            using (Stream data = client.OpenRead (args[0]))
            using (StreamReader reader = new StreamReader (data))
            {
                string s = reader.ReadToEnd ();
                Console.WriteLine (s);
            }
        }
    }
}

Upvotes: 1

Jens Granlund

Reputation: 5070

The easiest way to download the page is the one Darin Dimitrov described.

If you also want all of the page's resources, such as images and CSS,
you have to parse the HTML after you have downloaded it.
The best way to do that seems to be with the Html Agility Pack.
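
As a rough illustration of that parsing step, here is a minimal sketch. It assumes the HtmlAgilityPack NuGet package is installed and that the HTML has already been downloaded into a string; ResourceFinder and FindResourceUrls are hypothetical names chosen for this example, and the set of elements it looks at (img, link, script) is only one reasonable choice.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack; // third-party NuGet package, assumed to be installed

public static class ResourceFinder
{
    // Hypothetical helper: returns absolute URLs referenced by <img src>,
    // <link href> (stylesheets, icons, ...) and <script src> elements.
    public static List<string> FindResourceUrls(string html, Uri pageUri)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var urls = new List<string>();

        // Each entry is an XPath query plus the attribute holding the URL.
        var targets = new[]
        {
            new[] { "//img[@src]", "src" },
            new[] { "//link[@href]", "href" },
            new[] { "//script[@src]", "src" },
        };

        foreach (var target in targets)
        {
            // SelectNodes returns null when nothing matches.
            HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes(target[0]);
            if (nodes == null) continue;

            foreach (HtmlNode node in nodes)
            {
                string value = node.GetAttributeValue(target[1], "");

                // Resolve relative references against the page's own address.
                Uri absolute;
                if (Uri.TryCreate(pageUri, value, out absolute))
                {
                    urls.Add(absolute.AbsoluteUri);
                }
            }
        }

        return urls;
    }
}
```

Each returned URL could then be fetched with the same WebClient approach used for the page itself.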

Upvotes: 0

bendewey

Reputation: 40265

If you are doing some heavy RESTful programming with the URL, you may want to look into the HttpClient available with the REST Starter Kit Preview 2. With it, you could do something like this:

using (var client = new HttpClient())
{
   var page = client.Get("http://example.com").EnsureStatusIsSuccessful()
                    .Content.ReadAsString();
}
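
On current .NET versions a similar pattern is possible with System.Net.Http.HttpClient, which is a different library from the REST Starter Kit client shown above. The following is a minimal sketch under that assumption; PageDownloader and DownloadPageAsync are hypothetical names, and the caller is assumed to supply and manage the HttpClient instance.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

public static class PageDownloader
{
    // Hypothetical helper: downloads the response body at `url` as a string.
    // EnsureSuccessStatusCode throws HttpRequestException for non-2xx responses.
    public static async Task<string> DownloadPageAsync(HttpClient client, string url)
    {
        using (HttpResponseMessage response = await client.GetAsync(url))
        {
            response.EnsureSuccessStatusCode();
            return await response.Content.ReadAsStringAsync();
        }
    }
}
```

From an async method this would be called as: string page = await PageDownloader.DownloadPageAsync(client, "http://example.com"); Passing the client in, rather than creating one per call, lets a single HttpClient be reused across requests.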

Upvotes: 4

Jason Williams

Reputation: 57952

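Darin has answered this, but another approach is to just open a stream. Note that FileStream cannot open a URL, so the stream has to come from WebClient.OpenRead (in System.Net):

Stream s = new WebClient().OpenRead("http://www.someplace.com/somepage.html");

...and then read from it as you would from a normal file stream.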

Upvotes: 6

Darin Dimitrov

Reputation: 1039528

You could use WebClient:

using (var client = new WebClient())
{
    string content = client.DownloadString("http://www.google.com");
}

Upvotes: 13
