Reputation: 89
I am having trouble downloading the source of a webpage. I can view the page fine in any browser, and I can also run a web spider and download the first page with no problem. But whenever I run the code below to grab the source of that page, I always get a 403 Forbidden error.
The 403 Forbidden error is returned as soon as the request is sent. Does anyone have any ideas?
string urlAddress = "http://www.brownells.com/";
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(urlAddress);
HttpWebResponse response = (HttpWebResponse)request.GetResponse();
if (response.StatusCode == HttpStatusCode.OK)
{
Stream receiveStream = response.GetResponseStream();
StreamReader readStream = null;
.................................
response.Close();
readStream.Close();
Upvotes: 1
Views: 281
Reputation: 15294
string uri = @"http://brownells.com";
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(uri);
request.AutomaticDecompression = DecompressionMethods.Deflate | DecompressionMethods.GZip;
request.UserAgent = @"Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/42.0.2311.135 Safari/537.36";
request.Accept = @"text/html";
using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
using (Stream stream = response.GetResponseStream())
using (StreamReader reader = new StreamReader(stream))
{
    Console.WriteLine(reader.ReadToEnd());
}
request.AutomaticDecompression notifies the server that we, the client, support both the gzip and Deflate compression schemes, so there'll be some performance gain there. It isn't needed, though: the server only requires that you have your UserAgent and Accept headers set.
Remember, if you can do it in a browser, you can do it in C#. The only time you'll seriously struggle is if there's some JavaScript sorcery where the site sets cookies using JavaScript. It's rare, but it happens.
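For the cookie case, here is a minimal sketch of attaching a CookieContainer to the request, which is one way to keep cookies the server sends and to add any that a script would normally set. The cookie name and value (exampleName, exampleValue) are hypothetical placeholders, not something this site is known to use; you would have to find the real ones by inspecting the browser's requests.

```csharp
using System;
using System.Net;

class CookieSketch
{
    static void Main()
    {
        string uri = @"http://brownells.com";

        // A container attached to the request stores cookies from Set-Cookie
        // headers and sends them back on later requests that share it.
        CookieContainer cookies = new CookieContainer();

        // Hypothetical: a cookie that the site's JavaScript would normally set.
        // The name and value here are placeholders for illustration only.
        cookies.Add(new Uri(uri), new Cookie("exampleName", "exampleValue"));

        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(uri);
        request.CookieContainer = cookies;
        request.UserAgent = @"Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/42.0.2311.135 Safari/537.36";
        request.Accept = @"text/html";

        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        {
            // List whatever cookies the container now holds for the final URI.
            foreach (Cookie cookie in cookies.GetCookies(response.ResponseUri))
            {
                Console.WriteLine(cookie.Name + " = " + cookie.Value);
            }
        }
    }
}
```

Reusing the same CookieContainer instance on follow-up requests is what makes the session carry over between them.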
Back to the topic at hand...
If you want to dump the page to a file, write it out through a StreamWriter opened on the file path:
using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
using (Stream stream = response.GetResponseStream())
using (StreamReader reader = new StreamReader(stream))
using (TextWriter writer = new StreamWriter("filePath.html"))
{
    // Read the whole response body and write it to the file.
    writer.Write(reader.ReadToEnd());
}
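As a side note on the code in the question: GetResponse throws a WebException for a 403, so the StatusCode == HttpStatusCode.OK check is never reached. A minimal sketch of catching that exception and printing what the server sent back, assuming the same URL and no extra headers so that the failure can be observed:

```csharp
using System;
using System.IO;
using System.Net;

class ForbiddenSketch
{
    static void Main()
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(@"http://brownells.com");

        try
        {
            using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
            {
                Console.WriteLine("Status: " + (int)response.StatusCode);
            }
        }
        catch (WebException ex)
        {
            // For protocol errors such as 403, the failed response is attached
            // to the exception; for network failures it is null.
            HttpWebResponse error = ex.Response as HttpWebResponse;
            if (error == null)
            {
                Console.WriteLine("No HTTP response: " + ex.Status);
                return;
            }

            using (error)
            using (StreamReader reader = new StreamReader(error.GetResponseStream()))
            {
                Console.WriteLine((int)error.StatusCode + " " + error.StatusDescription);
                // The body of an error page sometimes explains the rejection.
                Console.WriteLine(reader.ReadToEnd());
            }
        }
    }
}
```

Whether the error body says anything useful depends on the server; the status line alone is often all you get.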
Upvotes: 2