Reputation: 7466
Many editors, such as Medium's, offer formatting now. From what I can see in the DOM, the editor simply adds HTML. But how do you sanitize this kind of input without losing the formatting applied by the user?
E.g. clicking bold adds:
<strong class="markup--strong markup--p-strong">text</strong>
but you wouldn't want to render arbitrary HTML that users type in themselves. So how is that different? Would it also be different if you styled with Markdown, but didn't let users type their own Markdown and only made it accessible through the browser?
One way I could think of is escaping every HTML special character, but that seems odd. As far as I know, you sanitize the content only when outputting it.
Upvotes: 8
Views: 674
Reputation: 174
You could replace the whitelisted elements with a placeholder, for example:
<strong[^>]*> becomes |strong|
Then you remove ALL other HTML. Be aware of event-handler attributes such as onmouseover="alert(1)", so keep the placeholders really simple.
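As a rough sketch of that idea (the function name `toPlaceholders` and the allowlist are made up for illustration, and a regex is not a full HTML parser), something like this turns allowed tags into attribute-free placeholders and drops every other tag:

```javascript
// Hypothetical allowlist; extend it with whatever formatting the editor offers.
const ALLOWED_TAGS = new Set(["strong", "em"]);

// Matches an opening or closing tag, with or without attributes.
// Group 1 is "/" for closing tags, group 2 is the tag name.
const TAG = /<(\/?)([a-zA-Z][^\s\/<>]*)(?:[\s\/][^<>]*)?>/g;

function toPlaceholders(html) {
  return html.replace(TAG, (match, slash, name) => {
    const tag = name.toLowerCase();
    // Allowed tags lose all attributes (class, onmouseover, ...);
    // every other tag is removed, its inner text stays behind.
    return ALLOWED_TAGS.has(tag) ? `|${slash}${tag}|` : "";
  });
}
```

With the markup from the question, `toPlaceholders('<strong class="markup--strong markup--p-strong">text</strong>')` gives `|strong|text|/strong|`. Anything the regex does not recognise as a tag is left as plain text, so the result still has to be rendered as text, never as markup.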
Also be careful when rendering the user input. Don't just inject it as markup. Instead, parse it and create the elements using JavaScript. Never use innerHTML; use .innerText (or .textContent) and document.createElement() instead.
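A minimal sketch of that rendering step, assuming the stored text uses `|strong|` and `|/strong|` markers like the example above. The name `renderInto` and the allowlist are hypothetical. Only `createElement`, `createTextNode` and `appendChild` are used, so user text never goes through the HTML parser:

```javascript
// Hypothetical allowlist of markers that may become real elements.
const RENDERABLE = new Set(["strong", "em"]);

function renderInto(container, text) {
  const doc = container.ownerDocument;
  const stack = [container]; // last entry is the element currently being filled

  // split() with a capture group keeps the markers in the resulting array.
  for (const part of text.split(/(\|\/?[a-z]+\|)/)) {
    if (part === "") continue;
    const marker = /^\|(\/?)([a-z]+)\|$/.exec(part);
    if (marker && RENDERABLE.has(marker[2])) {
      if (marker[1] === "") {
        // Opening marker: create the element and descend into it.
        const el = doc.createElement(marker[2]);
        stack[stack.length - 1].appendChild(el);
        stack.push(el);
      } else if (stack.length > 1) {
        // Closing marker: go back to the parent; stray closers are ignored.
        stack.pop();
      }
    } else {
      // Everything else, including any leftover "<...>", becomes a text node.
      stack[stack.length - 1].appendChild(doc.createTextNode(part));
    }
  }
  return container;
}
```

In a browser you would call it as `renderInto(document.getElementById("post"), storedText)`, where `post` is whatever element should hold the content. A string like `<img src=x onerror=alert(1)>` in `storedText` shows up literally on the page instead of being executed.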
Upvotes: 1
Reputation: 299
You should use a server-side sanitizer, as stated by Vipin, because client-side validation is prone to tampering. OWASP (Open Web Application Security Project) has some guides and sanitizers that you may use, like the java-html-sanitizer.
For a general overview of the concept, please read https://www.owasp.org/index.php/Data_Validation, under the section Sanitize.
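To illustrate the allowlist idea on the server without any library: this is not the java-html-sanitizer API, just a hand-rolled Node.js sketch with made-up names (`sanitize`, `SAFE_TAGS`), and a maintained sanitizer is the safer choice for real input. It escapes all text and re-emits only bare, attribute-free versions of the allowed tags:

```javascript
// Hypothetical allowlist of formatting tags the editor is allowed to produce.
const SAFE_TAGS = new Set(["strong", "em"]);

// Opening or closing tag; group 1 is "/" for closers, group 2 is the name.
const TAG_PATTERN = /<(\/?)([a-zA-Z][^\s\/<>]*)(?:[\s\/][^<>]*)?>/g;

function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

function sanitize(html) {
  let out = "";
  let last = 0;
  for (const m of html.matchAll(TAG_PATTERN)) {
    // Text between tags is always escaped.
    out += escapeHtml(html.slice(last, m.index));
    const name = m[2].toLowerCase();
    // Allowed tags are rebuilt from scratch, so no attribute survives;
    // all other tags are dropped.
    if (SAFE_TAGS.has(name)) out += `<${m[1]}${name}>`;
    last = m.index + m[0].length;
  }
  return out + escapeHtml(html.slice(last));
}
```

This keeps the user's bold text while throwing away the classes and any event handlers, whether the markup came from the editor or was typed by hand. It does not balance tags, so an unclosed `<strong>` would still need handling.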
Upvotes: 3