asim

Reputation: 29

Given a website's HTML in a string, how do I extract tag elements?

HttpWebRequest myRequest = (HttpWebRequest)WebRequest.Create("http://www.home.com");
myRequest.Method = "GET";
WebResponse myResponse = myRequest.GetResponse();
StreamReader sr = new StreamReader(myResponse.GetResponseStream(), 
                                   System.Text.Encoding.UTF8);
string result = sr.ReadToEnd();
sr.Close();
myResponse.Close();

The string contains the whole HTML of that web page. Now I want to extract the HTML tags from that string.

How do I do that?

Upvotes: 2

Views: 534

Answers (1)

Davita

Reputation: 9114

The Html Agility Pack makes parsing HTML content a piece of cake. You can see examples here.

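HtmlDocument doc = new HtmlDocument();
doc.Load("file.htm");
// Note: SelectNodes returns null when no node matches the XPath expression.
foreach (HtmlNode link in doc.DocumentNode.SelectNodes("//a[@href]"))
{
    HtmlAttribute att = link.Attributes["href"];
    att.Value = FixLink(att); // FixLink is your own helper, not part of the library
}
doc.Save("file.htm");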
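
The snippet above loads from a file, but in the question the HTML is already in a string. A minimal sketch of that case follows, assuming the HtmlAgilityPack NuGet package is referenced. The class name TagExtractor and its two method names are hypothetical, chosen only for illustration; LoadHtml, DocumentNode, SelectNodes, Descendants and GetAttributeValue are the library calls it relies on.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack; // NuGet package "HtmlAgilityPack"

// Hypothetical helper class for illustration; not part of the library.
public static class TagExtractor
{
    // Returns the href value of every <a> element that has an href attribute.
    public static List<string> ExtractLinks(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html); // parse from a string rather than from a file

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection anchors = doc.DocumentNode.SelectNodes("//a[@href]");
        if (anchors == null)
            return new List<string>();

        return anchors
            .Select(a => a.GetAttributeValue("href", string.Empty))
            .ToList();
    }

    // Returns the distinct element names found in the document, e.g. "body", "a".
    public static List<string> ExtractTagNames(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        return doc.DocumentNode.Descendants()
            .Where(n => n.NodeType == HtmlNodeType.Element) // skip text and comment nodes
            .Select(n => n.Name)
            .Distinct()
            .ToList();
    }
}
```

With the code from the question, this could be called as TagExtractor.ExtractLinks(result), where result is the string read from the response stream. Swapping the XPath expression (for example "//img[@src]") should select other tags in the same way.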

Upvotes: 6
