Reputation: 213
Assume that we have the following HTML strings:
string A = " <table width=325><tr><td width=325>test</td></tr></table>";
string B = " <<table width=325><tr><td width=325>test</td></table>";
How can we validate A or B in C# according to HTML specifications?
A should return true whereas B should return false.
Upvotes: 6
Views: 17748
Reputation: 5921
For this specific case you can use HTML Agility Pack to check whether the HTML is well formed or contains closing tags that were never opened.
using System;
using HtmlAgilityPack;

var htmlDoc = new HtmlDocument();
htmlDoc.LoadHtml(
    "WAVEFORM</u> YES, <u>NEGATIVE AUSCULTATION OF EPIGASTRUM</u> YES,");

// ParseErrors lists every problem the parser recorded while loading.
foreach (var error in htmlDoc.ParseErrors)
{
    // Prints: TagNotOpened
    Console.WriteLine(error.Code);
    // Prints: Start tag <u> was not found
    Console.WriteLine(error.Reason);
}
Related question: "Checking a HTML string for unopened tags"
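To get the true/false answer the question asks for, the loop above could be wrapped in a small helper. This is only a sketch: `HtmlCheck` and `HasNoParseErrors` are hypothetical names, not part of HTML Agility Pack, and an empty `ParseErrors` collection only means the parser recorded no problems, not that the markup conforms to the HTML specification. The parser is lenient, so check what it reports for your own inputs before relying on it.

```csharp
using System.Linq;
using HtmlAgilityPack;

public static class HtmlCheck
{
    // Hypothetical helper: true when HTML Agility Pack records no parse errors.
    // Assumption: "no recorded errors" is a good enough stand-in for "valid".
    public static bool HasNoParseErrors(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        return !doc.ParseErrors.Any();
    }
}
```

Calling `HtmlCheck.HasNoParseErrors(B)` would then return false whenever the parser flags something in B.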
Upvotes: 15
Reputation: 5773
GitHub link: https://github.com/markbeaton/TidyManaged
TidyManaged is a .NET wrapper for HTML Tidy. I haven't used it, but it may be what you are looking for.
Upvotes: 0
Reputation: 3768
One starting point is to check whether it's valid XML.
By the way, I think both your examples are incorrect, as you've left out the </tr> from both.
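A minimal sketch of the XML check, using `XDocument.Parse` from `System.Xml.Linq`. `XmlCheck` and `IsWellFormedXml` are hypothetical names chosen for illustration. Note that XML is stricter than HTML: attribute values must be quoted, so markup such as `width=325` is rejected even when every tag is balanced. Treat this as a check for well-formed XHTML-style markup rather than for the HTML specification.

```csharp
using System.Xml;
using System.Xml.Linq;

public static class XmlCheck
{
    // Hypothetical helper: true when the markup parses as well-formed XML.
    public static bool IsWellFormedXml(string markup)
    {
        try
        {
            XDocument.Parse(markup);
            return true;
        }
        catch (XmlException)
        {
            // Mismatched tags, unquoted attributes, stray '<', and so on.
            return false;
        }
    }
}
```

With quoted attributes, `"<table width=\"325\"><tr><td>test</td></tr></table>"` passes, while the same string without `</tr>` fails because `</table>` no longer matches the open `<tr>`.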
Upvotes: 1