Reputation: 537
{Yup, the above more or less explains it} :)
Regex oRegex = new Regex("<body.*?>(.*?)</body>", RegexOptions.Multiline);
The above doesn't seem to work if the body tag has any attributes in it.
Upvotes: 2
Views: 2735
Reputation: 537
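I eventually solved it by using RegexOptions.Singleline instead of RegexOptions.Multiline. Singleline makes . match newline characters as well, so the pattern can span a body that covers several lines; Multiline only changes how ^ and $ match.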
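For reference, here is a minimal sketch of the Singleline approach. The class and method names (RegexBodyExtractor, ExtractBody) are made up for illustration, and the pattern is tightened slightly from the one in the question (`[^>]*` for the attributes, plus IgnoreCase), which is my assumption rather than what was originally used:

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class RegexBodyExtractor
{
    // Singleline lets '.' match '\n', so the capture can span several lines.
    // IgnoreCase is an added assumption so that <BODY> is accepted too.
    private static readonly Regex BodyRegex = new Regex(
        @"<body\b[^>]*>(.*?)</body>",
        RegexOptions.Singleline | RegexOptions.IgnoreCase);

    // Returns the text between <body ...> and </body>, or null if there is no match.
    public static string ExtractBody(string html)
    {
        Match m = BodyRegex.Match(html);
        return m.Success ? m.Groups[1].Value : null;
    }

    public static void Main()
    {
        string html = "<html><body class=\"x\">\n<p>Hi</p>\n</body></html>";
        Console.WriteLine(ExtractBody(html));
    }
}
```

This still breaks on input such as a literal "</body>" inside a script or comment, which is why the parser-based answers are usually the safer choice.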
Upvotes: 1
Reputation: 1062770
With the HTML Agility Pack (assuming it is html, not xhtml):
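// Requires the HtmlAgilityPack package (using HtmlAgilityPack;).
HtmlDocument doc = new HtmlDocument();
doc.LoadHtml(html);
// SelectSingleNode returns null when nothing matches, so body may be null.
string body = doc.DocumentNode.SelectSingleNode("/html/body")?.InnerHtml;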
Upvotes: 10
Reputation: 60418
Don't use a regular expression. Use something that's meant to parse XML/HTML:
XmlDocument.SelectSingleNode("//body").InnerXml;
Load your string into an XmlDocument, use the SelectSingleNode function (which takes an XPath expression as a parameter), then extract what you need from the resulting XmlNode.
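A minimal sketch of those steps, assuming the input is well-formed XML (for example XHTML without a default namespace); XmlDocument.LoadXml throws an XmlException on ordinary tag-soup HTML. The class and method names are hypothetical:

```csharp
using System;
using System.Xml;

// Hypothetical helper, not part of any library.
public static class XmlBodyExtractor
{
    // Returns the inner markup of the first <body> element, or null if none exists.
    public static string ExtractBody(string xhtml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xhtml); // throws XmlException if the markup is not well-formed
        XmlNode body = doc.SelectSingleNode("//body");
        return body?.InnerXml;
    }

    public static void Main()
    {
        string xhtml = "<html><body class=\"x\"><p>Hi</p></body></html>";
        Console.WriteLine(ExtractBody(xhtml));
    }
}
```

If the document declares the XHTML namespace (xmlns="http://www.w3.org/1999/xhtml"), the unprefixed "//body" will not match; in that case you would need to pass an XmlNamespaceManager to SelectSingleNode and use a prefixed path.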
Upvotes: 4