Ori S

Reputation: 165

WebClient Headers

I am using WebClient to scrape Google search results. I kept getting "Cannot reach this page" until I changed the User-Agent header:

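            // URL-encode the query so spaces and special characters survive the request.
            string query = Uri.EscapeDataString(my_stocks[order].Symbole + " stock");
            string page = string.Format("https://www.google.com/search?q={0}&hl=en", query);
            WebClient client = new WebClient();
            // Without a browser-like User-Agent the request fails with "Cannot reach this page".
            client.Headers["User-Agent"] = "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)";
            string r = client.DownloadString(page);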

However, the returned HTML differs from what I see when I search for the same thing in Chrome. So I tried changing the header to the User-Agent that Chrome reports at https://www.whatismybrowser.com/detect/what-is-my-user-agent, but then I get "Cannot reach this page" again. What am I missing here?

Upvotes: 1

Views: 943

Answers (1)

James Woodall

Reputation: 735

My 2 cents ...

Since the rise of single-page applications, web scraping isn't what it used to be, because pages are often no longer rendered server-side.

It's highly likely that a Google Search is delivered using asynchronous REST queries, rather than a server-side rendered page.

Watch the Network trace in your Chrome tab when you do a Google search and you'll likely see many different network requests.

I suggest that you look for a more specific API to deal with the type of request that you're looking to make.
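
As one possible illustration of the "more specific API" idea, here is a minimal sketch that queries Google's Custom Search JSON API instead of scraping the HTML results page. This is my assumption about a suitable API, not something the question requires; a dedicated stock-quote API may fit better. The class name `SearchApiSketch`, the helper names, and the `apiKey` / `engineId` values are placeholders you would need to supply yourself.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

public static class SearchApiSketch
{
    // Builds the request URL for the Custom Search JSON API.
    // apiKey and engineId are placeholders (assumed to come from your own Google setup).
    public static string BuildSearchUrl(string apiKey, string engineId, string query)
    {
        return "https://www.googleapis.com/customsearch/v1"
            + "?key=" + Uri.EscapeDataString(apiKey)
            + "&cx=" + Uri.EscapeDataString(engineId)
            + "&q=" + Uri.EscapeDataString(query);
    }

    // Sends the request and returns the raw JSON body; parsing is left to the caller.
    public static async Task<string> SearchAsync(
        HttpClient http, string apiKey, string engineId, string query)
    {
        return await http.GetStringAsync(BuildSearchUrl(apiKey, engineId, query));
    }
}
```

The response is JSON rather than HTML, so it does not depend on the User-Agent header or on how the results page happens to be rendered in a browser.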

Upvotes: 1
