Reputation: 5228
I am using the code from this post, "Get HTML code from website in C#", to save the HTML in a string:
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
HttpWebResponse response = (HttpWebResponse)request.GetResponse();
if (response.StatusCode == HttpStatusCode.OK)
{
    Stream receiveStream = response.GetResponseStream();
    StreamReader readStream;
    if (response.CharacterSet == null)
        readStream = new StreamReader(receiveStream);
    else
        readStream = new StreamReader(receiveStream, Encoding.GetEncoding(response.CharacterSet));
    string data = readStream.ReadToEnd();
    response.Close();
    readStream.Close();
    msgBox.Text = data;
}
However, the page I am trying to read shows a temporary loader page first. How can I get around this, so that the HTML is saved only after the actual page has loaded?
Best regards
Upvotes: 0
Views: 1870
Reputation: 1
Why don't you use a WebBrowser control and add a delay with
await Task.Delay(n)
Upvotes: 0
Reputation: 218798
the page I am trying to read has a temporary loader page
It all depends on what that means and how that "temporary loader page" works. For example, if that page is (whether from JavaScript code or some HTML META redirect) making a request to the destination page, then that request is what you need to capture. Currently you're reading from a given URL:
(HttpWebRequest)WebRequest.Create(url)
This is essentially making a GET request to that URL and reading the response. But based on your description it sounds like that's the wrong URL. It sounds like there's a second URL which contains the actual information you're looking for.
Given that, you essentially have two options:
1. Determine the second URL manually and use it as the url in your code.
2. Request the first page in code, parse the response to find the new url value, and make a second request to the new URL.
Clearly the first option is a lot easier. The second is only necessary if that second URL changes with each visit or is expected to change frequently over time. If that's the case then you'd have to basically reverse-engineer how the website is performing the second request so you can perform it as well.
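As a rough illustration of the second option, here is a minimal sketch. It assumes the loader page redirects with an HTML META refresh tag whose http-equiv attribute comes before its content attribute. That is only one possibility; a JavaScript-driven loader would need a different approach. The class name LoaderPageSketch and both method names are hypothetical, and HttpClient is swapped in for the HttpWebRequest used in the question.

```csharp
using System;
using System.Net.Http;
using System.Text.RegularExpressions;
using System.Threading.Tasks;

static class LoaderPageSketch
{
    // Assumption: the loader page contains something like
    // <meta http-equiv="refresh" content="0; url=/real-page">
    static readonly Regex MetaRefresh = new Regex(
        @"<meta[^>]*http-equiv\s*=\s*[""']?refresh[""']?[^>]*content\s*=\s*[""']\s*\d+\s*;\s*url\s*=\s*[""']?([^""'>]+)",
        RegexOptions.IgnoreCase);

    // Returns the absolute redirect target, or null if no META refresh is found.
    public static Uri FindRedirectTarget(string html, Uri pageUri)
    {
        Match m = MetaRefresh.Match(html);
        if (!m.Success) return null;
        // Resolve relative targets against the URL of the loader page.
        return new Uri(pageUri, m.Groups[1].Value.Trim());
    }

    // Fetches the first page, then follows the META refresh once if there is one.
    public static async Task<string> GetFinalHtmlAsync(string url)
    {
        using (var client = new HttpClient())
        {
            var firstUri = new Uri(url);
            string html = await client.GetStringAsync(firstUri);
            Uri target = FindRedirectTarget(html, firstUri);
            // No redirect found: the first response is all we have.
            return target == null ? html : await client.GetStringAsync(target);
        }
    }
}
```

From an async event handler this could be called as msgBox.Text = await LoaderPageSketch.GetFinalHtmlAsync(url);. If the loader page fetches its content with JavaScript instead, the regular expression will not match, and the request to reproduce has to be identified some other way, for example by inspecting the page's network traffic in a browser.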
Web scraping can get complicated pretty quickly, and often turns into a game of cat and mouse (even unintentionally and mutually unaware) between the person scraping the content and the person hosting the content (who might not want it to be scraped).
Upvotes: 2