Yasser Shaikh

Reputation: 47794

How to use Html Agility Pack for HTML validations

I am using HTML Agility Pack to validate my HTML. This is what I have so far:

public class MarkupErrors
{
    public string ErrorCode { get; set; }
    public string ErrorReason { get; set; }
}

public static List<MarkupErrors> IsMarkupValid(string html)
{
    var document = new HtmlAgilityPack.HtmlDocument();
    document.OptionFixNestedTags = true;
    document.LoadHtml(html);

    var parserErrors = new List<MarkupErrors>();
    foreach (var error in document.ParseErrors)
    {
        parserErrors.Add(new MarkupErrors
        {
            ErrorCode = error.Code.ToString(),
            ErrorReason = error.Reason
        });
    }

    return parserErrors;
}

Say my input looks like this:

<h1>Test</h1> 
Hello World</h2> 
<h3>Missing close h3 tag

My current function returns a list with the following errors:

- Start tag <h2> was not found
- End tag </h3> was not found

which is fine.

My problem is that I want the entire HTML document to be valid, that is, with proper <head> and <body> tags, because this HTML will later be available for preview and for download as an .html file.

So I was wondering whether I can check for this using HTML Agility Pack?

Any ideas or other options will be appreciated. Thanks

Upvotes: 4

Views: 3412

Answers (1)

Simon Mourier

Reputation: 139095

You can check whether there is a HEAD element and a BODY element directly under the HTML element like this, for example:

bool hasHead = doc.DocumentNode.SelectSingleNode("html/head") != null;
bool hasBody = doc.DocumentNode.SelectSingleNode("html/body") != null;

Each check evaluates to false if there is no HTML element, or if the corresponding HEAD or BODY element is not a direct child of the HTML element.

Note that I don't use an XPath expression like "//head", because it would return a result even if the HEAD element was not directly under the HTML element.
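As a rough sketch, the two checks can be combined with the parse-error loop from the question into a single helper. The class name MarkupValidator, the method name FindProblems, and the message strings are made up for illustration and are not part of HTML Agility Pack; it also assumes the parser does not insert missing html, head, or body elements on its own.

```csharp
using System.Collections.Generic;
using HtmlAgilityPack;

public static class MarkupValidator
{
    // Hypothetical helper: returns one message per detected problem.
    // An empty list means neither check found anything to report.
    public static List<string> FindProblems(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var problems = new List<string>();

        // Parser-level errors, as in the question's code.
        foreach (HtmlParseError error in doc.ParseErrors)
        {
            problems.Add(error.Code + ": " + error.Reason);
        }

        // Structural checks: the XPath is relative to the document root,
        // so "html/head" only matches a <head> directly under <html>.
        if (doc.DocumentNode.SelectSingleNode("html") == null)
        {
            problems.Add("Missing <html> element");
        }
        else
        {
            if (doc.DocumentNode.SelectSingleNode("html/head") == null)
                problems.Add("Missing <head> element under <html>");
            if (doc.DocumentNode.SelectSingleNode("html/body") == null)
                problems.Add("Missing <body> element under <html>");
        }

        return problems;
    }
}
```

Called on the fragment from the question, this should report the missing html element in addition to whatever parse errors the library finds. It only checks that the elements exist in the expected place, not that the document is valid against any HTML specification.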

Upvotes: 6
