user1328842

Reputation: 85

Problem getting the HTML source of a web page

I want to fetch the HTML source of a page so I can analyze the stock information on it, so I use the sample code below to download the HTML in C#. When I compile and run it, the variable result contains a string like this:

<html>
  <head></head>
  <body>
    <form id='submit_form' name='submit_form' method='post'
          action='http://pchome.syspower.com.tw/stock/sto0/ock2/sid2404.html'>
      <input type='hidden' name='is_check' value='1' />
    </form>
    <script type="text/javascript">
        document.getElementById('submit_form').submit();
    </script>
  </body>
</html>

(Not exact, but very similar. I've indented the data a little to make it readable)

I want to get the price data, such as 29.15$, or the number listed for each price, such as 29.20$ -> 364 and 29.15$ -> 174, but the returned string contains none of this data.

Could someone suggest how to solve this? Thank you very much :)

string urlAddress = "http://pchome.syspower.com.tw/stock/sto0/ock2/sid2404.html";

private void button1_Click(object sender, EventArgs e)
{
    WebRequest myRequest = WebRequest.Create(urlAddress);
    myRequest.Method = "GET";
    WebResponse myResponse = myRequest.GetResponse();
    StreamReader sr = new StreamReader(myResponse.GetResponseStream());
    string result = sr.ReadToEnd();
    sr.Close();
    myResponse.Close();
}

Upvotes: 0

Views: 115

Answers (1)

master131

Reputation: 431

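The website does not return the stock page on a plain GET. Instead it returns the small page you are seeing, whose script submits a form with the hidden field is_check=1 back to the same URL. WebRequest does not run that script, so the second request never happens. To get around this, send the POST yourself with the hidden field from the page's source. I just tested this and it works: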

// Requires: using System.IO; using System.Net; using System.Text;
string urlAddress = "http://pchome.syspower.com.tw/stock/sto0/ock2/sid2404.html";
byte[] body = Encoding.UTF8.GetBytes("is_check=1"); // the hidden form field

var request = (HttpWebRequest) WebRequest.Create(urlAddress);
request.Method = "POST";
request.ContentType = "application/x-www-form-urlencoded";
request.ContentLength = body.Length;

// Send the form body.
using (var requestStream = request.GetRequestStream())
{
    requestStream.Write(body, 0, body.Length);
}

// Read the returned HTML.
string result;
using (var response = (HttpWebResponse) request.GetResponse())
using (var sr = new StreamReader(response.GetResponseStream()))
{
    result = sr.ReadToEnd();
}

All the stock data is stored in the page source so you can parse it using regular expressions.
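
As a rough sketch of the regular expression step: I have not reproduced the real markup here, so the sample HTML, the pattern, and the QuoteParser name below are all made up for illustration. It assumes each price sits in a table cell immediately followed by a cell holding its number; you would need to adjust the pattern to whatever the actual page source looks like.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text.RegularExpressions;

static class QuoteParser
{
    // Hypothetical pattern: a cell with a decimal price, then a cell with an integer.
    // This is an assumption, not the real page layout.
    static readonly Regex RowPattern = new Regex(
        @"<td[^>]*>\s*(?<price>\d+\.\d+)\s*</td>\s*<td[^>]*>\s*(?<volume>\d+)\s*</td>",
        RegexOptions.IgnoreCase);

    // Returns every (price, number) pair the pattern finds in the HTML.
    public static List<KeyValuePair<decimal, int>> Parse(string html)
    {
        var quotes = new List<KeyValuePair<decimal, int>>();
        foreach (Match m in RowPattern.Matches(html))
        {
            decimal price = decimal.Parse(m.Groups["price"].Value, CultureInfo.InvariantCulture);
            int volume = int.Parse(m.Groups["volume"].Value, CultureInfo.InvariantCulture);
            quotes.Add(new KeyValuePair<decimal, int>(price, volume));
        }
        return quotes;
    }

    static void Main()
    {
        // Made-up sample input, NOT copied from the site.
        string sample = "<tr><td>29.20</td><td>364</td></tr>" +
                        "<tr><td>29.15</td><td>174</td></tr>";
        foreach (var q in Parse(sample))
        {
            Console.WriteLine("{0} -> {1}", q.Key, q.Value);
        }
    }
}
```

In practice you would pass the result string from the request above to QuoteParser.Parse instead of the sample. If the markup turns out to be irregular, an HTML parser may be more robust than a regular expression.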

Upvotes: 1
