Reputation: 5761
I'd like to know one (or more) ways to parse the HTML output of a page. I'd like to detect certain patterns in the HTML that will be sent to the client and log some info if they are present.
Upvotes: 1
Views: 430
Reputation: 21864
Everything you need is in the Page.Render method. Override it and do what you want in there.
protected override void Render(HtmlTextWriter writer)
{
    // Requires: using System.IO; using System.Text; using System.Web.UI;
    // Render the page into an in-memory writer instead of straight to the response.
    StringBuilder stringBuilder = new StringBuilder();
    using (StringWriter stringWriter = new StringWriter(stringBuilder))
    using (HtmlTextWriter htmlTextWriter = new HtmlTextWriter(stringWriter))
    {
        // The HtmlTextWriter writes through the StringWriter into the StringBuilder.
        base.Render(htmlTextWriter);
    }
    string theHtml = stringBuilder.ToString(); // the complete HTML of the page

    // Inspect theHtml here (detect patterns, log, ...).

    writer.Write(theHtml); // send the HTML to the client through the original writer
}
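As a rough illustration of the "inspect theHtml" step, here is a minimal sketch that scans the captured string with regular expressions and returns one log-ready line per match. The class HtmlPatternScanner, its FindMatches method and the pattern names are hypothetical helpers made up for this example, not part of ASP.NET. Regular expressions are only suitable for simple, well-delimited patterns; they are not a real HTML parser.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper (not part of ASP.NET): scans captured markup for named patterns.
public static class HtmlPatternScanner
{
    // Returns one line per match in the form "<pattern name>: <matched text>".
    // namedPatterns maps a label used in the log to a regular expression.
    public static List<string> FindMatches(string html, IDictionary<string, string> namedPatterns)
    {
        var hits = new List<string>();
        foreach (KeyValuePair<string, string> entry in namedPatterns)
        {
            // Case-insensitive, because HTML tag and attribute names may be written in any case.
            foreach (Match match in Regex.Matches(html, entry.Value, RegexOptions.IgnoreCase))
            {
                hits.Add(entry.Key + ": " + match.Value);
            }
        }
        return hits;
    }
}
```

Inside Render, you would call HtmlPatternScanner.FindMatches(theHtml, patterns) before writer.Write(theHtml) and pass each returned line to whatever logger the application already uses.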
Upvotes: 2
Reputation: 144122
It depends on what you mean by "parse" exactly, but something like the HTML Agility Pack can create an XML-like structure from an HTML document - essentially a proper HTML DOM data structure. You can then convert it straight to XML, query it with LINQ, etc.
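A minimal sketch of that approach, assuming the HtmlAgilityPack NuGet package is referenced. The class HtmlDomInspector and its FindLinkTargets method are hypothetical names for this example; the XPath query for anchors with an href attribute is only a stand-in for whatever pattern you actually want to detect.

```csharp
using System.Collections.Generic;
using HtmlAgilityPack; // third-party NuGet package, assumed to be installed

// Hypothetical helper: collects the href of every anchor in an HTML string.
public static class HtmlDomInspector
{
    public static List<string> FindLinkTargets(string html)
    {
        var document = new HtmlDocument();
        document.LoadHtml(html); // builds the DOM, tolerating malformed markup

        var targets = new List<string>();
        HtmlNodeCollection anchors = document.DocumentNode.SelectNodes("//a[@href]");
        if (anchors == null)
        {
            return targets; // SelectNodes returns null when nothing matches
        }

        foreach (HtmlNode anchor in anchors)
        {
            targets.Add(anchor.GetAttributeValue("href", string.Empty));
        }
        return targets;
    }
}
```

The html argument can be the string captured in an overridden Page.Render, so the two answers combine naturally.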
Upvotes: 1