karthik k

Reputation: 3981

Read data between tags in downloaded HTML

I have downloaded the HTML of a web page using the WebClient class. Now I want to read the data between specific tags. I know about HtmlAgilityPack, but I don't want to use it. I am using the following code to get the HTML:

        WebClient client = new WebClient();
        string url = "XXXXXXXXXXXXX";
        Byte[] requestedHTML;
        requestedHTML = client.DownloadData(url);
        string htmlcode = client.DownloadString(url);

        //client.DownloadFile(url, @"E:\test.html");

        UTF8Encoding objUTF8 = new UTF8Encoding();
        string html = objUTF8.GetString(requestedHTML);
        Response.Write(html);

Upvotes: 1

Views: 2857

Answers (2)

BreakHead

Reputation: 10672

Try this:

        // Requires: using System.Net; using System.Text; using System.Text.RegularExpressions;
        WebClient client = new WebClient();
        string url = "Your URL";

        // Download the page once as raw bytes and decode it as UTF-8.
        // (The original also called DownloadString, which fetched the page a second time.)
        byte[] requestedHTML = client.DownloadData(url);
        string html = Encoding.UTF8.GetString(requestedHTML);

        //client.DownloadFile(url, @"E:\test.html");

        // Singleline lets '.' match newlines, so <h3> content spanning lines is found too.
        MatchCollection matches = Regex.Matches(html, @"<h3>(.*?)</h3>",
            RegexOptions.Singleline);

        foreach (Match m in matches)
        {
            // Groups[1] holds the text between <h3> and </h3>.
            string value = m.Groups[1].Value;
        }

Inside the loop, the string value holds the text between the tags. For example, if the page contains <h3>Chicago</h3>, value will be "Chicago".

Upvotes: 1

Rosmarine Popcorn

Reputation: 10967

Use Regular Expressions instead.
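
A minimal sketch of what that could look like, assuming the markup is simple and the tags you care about are not nested inside each other. The TagReader class and GetTagContents method are hypothetical names used only for illustration, and the sample HTML in the usage note is made up:

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TagReader
{
    // Hypothetical helper (not from the original post): returns the text
    // found between each <tagName ...> and </tagName> pair in html.
    public static List<string> GetTagContents(string html, string tagName)
    {
        string name = Regex.Escape(tagName);

        // \b stops "h3" from also matching "h30"; [^>]* tolerates attributes
        // on the opening tag; (.*?) lazily captures the inner text.
        string pattern = "<" + name + @"\b[^>]*>(.*?)</" + name + ">";

        var results = new List<string>();
        foreach (Match m in Regex.Matches(html, pattern,
                     RegexOptions.Singleline | RegexOptions.IgnoreCase))
        {
            // Groups[1] is the captured inner text; Trim drops surrounding whitespace.
            results.Add(m.Groups[1].Value.Trim());
        }
        return results;
    }
}
```

For example, TagReader.GetTagContents("<h3>Chicago</h3><h3 class=\"c\">Boston</h3>", "h3") returns a list containing "Chicago" and "Boston". A regex like this does not understand nesting or malformed markup, so it is only suitable for small, predictable pages.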

Upvotes: 3
