Vivekh

Reputation: 4259

How can I extract the required data from a web page?

Hi all, as per my requirement, I would like to extract data from this site:

http://loving1.tea.state.tx.us/lonestar/Menu_dist.aspx?parameter=101902

I would like to extract the data presented in the grid. How can I do this? Can anyone help me?

I tried this

    // Requires: using System.IO; using System.Net;
    WebRequest request = WebRequest.Create("http://loving1.tea.state.tx.us/lonestar/Menu_dist.aspx?parameter=101902");
    string html;
    // Dispose the response, stream, and reader once the page has been read.
    using (WebResponse response = request.GetResponse())
    using (Stream data = response.GetResponseStream())
    using (StreamReader sr = new StreamReader(data))
    {
        html = sr.ReadToEnd();
    }


The grid data I would like to extract is shown in the image. Please help.

Upvotes: 1

Views: 1160

Answers (2)

sll

Reputation: 62484

The straightforward way is to download the page and parse the HTML by locating the appropriate <table> tags, but then your "parser" has to be updated every time the HTML layout changes.

Another way is to use the "Export To..." feature that the site kindly provides: simulate the HTTP request made by the "Export to Excel 2007" button. The idea is that an Excel 2007 workbook is a ZIP archive containing XML files (worksheet data plus XML style sheets, not CSS), so you can load the data from one or more well-formed XML files.

Underlying URL:

http://loving1.tea.state.tx.us/Common.Cognos/Secured/ReportViewer.aspx?reportSearchPath=/content/folder[@name='TPEIR']/folder[@name='LS']/package[@name='Districts and Schools']/report[@name='AAG5_Dist_Over']&ui.name=AAG5_Dist_Over&year=2010&district=101902&server=Loving1.tea.state.tx.us/lonestar

Then download the XLSX file, which is a ZIP archive with embedded XML files:

  • xl\worksheets\Sheet1.xml
  • xl\workbook.xml

So just unzip it, load the XML, and enjoy...
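
The unzip-and-load step could be sketched roughly as follows. This is only a minimal sketch, assuming .NET 4.5 or later (for System.IO.Compression.ZipArchive) and assuming the export is a standard Excel 2007 workbook; the XlsxSheetReader class and ReadFirstSheet method are hypothetical names, and how you obtain the XLSX stream from the export URL is left out. Note that it returns the raw <v> values only: cells marked t="s" hold an index into xl/sharedStrings.xml, which this sketch does not resolve.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Xml.Linq;

public static class XlsxSheetReader // hypothetical helper, not part of any library
{
    // SpreadsheetML main namespace used by Excel 2007+ worksheet parts.
    private static readonly XNamespace Ns =
        "http://schemas.openxmlformats.org/spreadsheetml/2006/main";

    // Returns the raw <v> text of every cell, row by row, from the first
    // worksheet part found in the archive. Cells without <v> become "".
    public static List<List<string>> ReadFirstSheet(Stream xlsx)
    {
        using (var zip = new ZipArchive(xlsx, ZipArchiveMode.Read, leaveOpen: true))
        {
            // Entry names inside a ZIP use forward slashes; the exact sheet
            // file name is matched case-insensitively rather than assumed.
            ZipArchiveEntry entry = zip.Entries
                .Where(e => e.FullName.StartsWith("xl/worksheets/", StringComparison.OrdinalIgnoreCase)
                         && e.FullName.EndsWith(".xml", StringComparison.OrdinalIgnoreCase))
                .OrderBy(e => e.FullName, StringComparer.OrdinalIgnoreCase)
                .FirstOrDefault();
            if (entry == null)
                throw new InvalidDataException("No worksheet part found in the archive.");

            using (Stream sheet = entry.Open())
            {
                XDocument doc = XDocument.Load(sheet);
                return doc.Descendants(Ns + "row")
                    .Select(row => row.Elements(Ns + "c")
                        .Select(cell => (string)cell.Element(Ns + "v") ?? string.Empty)
                        .ToList())
                    .ToList();
            }
        }
    }
}
```

You would pass it a stream over the downloaded file, for example File.OpenRead on wherever you saved the export.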

Upvotes: 1

Svarog

Reputation: 2208

Use new WebClient().DownloadString("http://loving1.tea.state.tx.us/lonestar/Menu_dist.aspx?parameter=101902") to get the data from the server.
Then use HtmlAgilityPack to parse the HTML.
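
Something along these lines might work as a starting point. It is a rough sketch that assumes the HtmlAgilityPack NuGet package is referenced and that the grid arrives as a plain HTML <table> in the downloaded page (I have not checked the actual markup, so the XPath will probably need narrowing to the right table). GridScraper and ExtractRows are made-up names for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack; // NuGet package: HtmlAgilityPack

public static class GridScraper // hypothetical helper for illustration
{
    // Returns the trimmed text of every <th>/<td> cell, row by row,
    // for each <tr> found inside any <table> in the given HTML.
    public static List<List<string>> ExtractRows(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var result = new List<List<string>>();

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection rows = doc.DocumentNode.SelectNodes("//table//tr");
        if (rows == null) return result;

        foreach (HtmlNode row in rows)
        {
            HtmlNodeCollection cells = row.SelectNodes("./th|./td");
            if (cells == null) continue; // skip rows without direct cells

            // Decode entities such as &amp; and strip surrounding whitespace.
            result.Add(cells
                .Select(c => HtmlEntity.DeEntitize(c.InnerText).Trim())
                .ToList());
        }
        return result;
    }
}
```

Usage would be roughly: string html = new WebClient().DownloadString(url); var rows = GridScraper.ExtractRows(html);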

Upvotes: 1
