Reputation: 139
How to download a web page using C#?
Upvotes: 4
Views: 2715
Reputation: 2837
Use the WebClient class, and set a user-agent request header in case the site blocks page spiders.
using System;
using System.Net;
using System.IO;

public class Test
{
    public static void Main (string[] args)
    {
        if (args == null || args.Length == 0)
        {
            throw new ApplicationException ("Specify the URI of the resource to retrieve.");
        }

        using (WebClient client = new WebClient ())
        {
            // Add a user-agent header in case the
            // requested site rejects anonymous clients.
            client.Headers.Add ("user-agent", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; .NET CLR 1.0.3705;)");

            using (Stream data = client.OpenRead (args[0]))
            using (StreamReader reader = new StreamReader (data))
            {
                string s = reader.ReadToEnd ();
                Console.WriteLine (s);
            }
        }
    }
}
Upvotes: 1
Reputation: 5070
The easiest way to download the page itself is what Darin Dimitrov described.
If you also want the resources the page references (e.g. images, CSS), you have to parse the downloaded HTML DOM.
A good tool for that is the Html Agility Pack.
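As a rough sketch of that approach, the snippet below downloads a page and lists the URLs of referenced images, stylesheets, and scripts. It assumes the third-party Html Agility Pack library is installed (NuGet package HtmlAgilityPack); the example URL and the particular XPath query are illustrative, not prescriptive.

using System;
using System.Net;
using HtmlAgilityPack;

public class ResourceLister
{
    public static void Main ()
    {
        string html;
        using (var client = new WebClient ())
        {
            html = client.DownloadString ("http://example.com");
        }

        var doc = new HtmlDocument ();
        doc.LoadHtml (html);

        // Select images, stylesheets and scripts; SelectNodes returns null when nothing matches.
        var nodes = doc.DocumentNode.SelectNodes ("//img[@src] | //link[@href] | //script[@src]");
        if (nodes != null)
        {
            foreach (var node in nodes)
            {
                string url = node.GetAttributeValue ("src", null)
                             ?? node.GetAttributeValue ("href", null);
                Console.WriteLine (url);
            }
        }
    }
}

From there you could resolve each URL against the page's base URI and download it with the same WebClient.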
Upvotes: 0
Reputation: 40265
If you're doing some heavy RESTful programming against the URL, you may want to look into the HttpClient available with the REST Starter Kit Preview 2. With it you could do something like this:
using (var client = new HttpClient())
{
    var page = client.Get("http://example.com").EnsureStatusIsSuccessful()
                     .Content.ReadAsString();
}
Upvotes: 4
Reputation: 57952
Darin's answered this, but another approach is to grab the response stream with WebRequest (FileStream won't accept an HTTP URL):
Stream s = WebRequest.Create("http://www.someplace.com/somepage.html")
                     .GetResponse().GetResponseStream();
...and then read it as if it were a normal file, e.g. with a StreamReader.
Upvotes: 6
Reputation: 1039528
You could use WebClient:
using (var client = new WebClient())
{
    string content = client.DownloadString("http://www.google.com");
}
Upvotes: 13