How to do white-listed HTML encoding in ASP.NET MVC 5?

Question

I'm working on a mini-CMS module for one of my projects, where users are allowed to edit content in markdown. I'm using markdown-it for parsing and showing a preview.

I was thinking a lot about how to send the input to the server, and also how to store it in the database. I came to a conclusion to avoid duplicating the markdown parsing at server-side, and send both markdown and the parsed HTML to the server. I think nowadays the added overhead is minimal, even on a site where edits are heavy.

So at final stage I still need to validate the HTML sent to the server, as it can be a security bottleneck of the system. I've read a lot about Microsoft's implementation of AntiXSS, and how it is (or was) quite unusable for such scenarios, as it was too gready. For example I've found this article with even a helper code (using HTMLAgilityPack) to give a usable sanitizing implementation.

Unfortunately I haven't found anything newer than 2013 on this topic. I'd like to ask at present how to do a proper HTML encoding where there are allowed tags and attributes, but still safe from any kind of XSS attacks? Is such a code like in the article still needed, or are there any built-in solutions?

Also, if my choice of client-side markdown parsing is not viable, what are some other options? What I want to avoid, is duplicating all kinds of markdown logic at both client and server. For example I've prepared several custom extensions for markdown-it, etc.

Zolt&#225;n Tam&#225;si · Accepted Answer

I ended up using the code in the article. I made an important change, I removed style attributes from the whitelist completely. I don't need them, the styling which I allow can be achieved by classes. Also, style attributes are also dangerous and hard to encode/escape properly. Now I feel that the code is safe enough for my current purposes.

How to do white-listed HTML encoding in ASP.NET MVC 5?

Answers (2)

Related Questions