Reputation: 4710
I work on a web application that uses Markdown as its syntax. The only issue I am facing is how to validate user input on the server side so that it is actually Markdown and not an XSS attack injected through a crafted POST request or by disabling JavaScript.
I know StackOverflow does this, but how do they do it while still allowing certain HTML tags, including images, that are prone to XSS attacks? Is there any open source package that can help? Examples appreciated.
Because I heard that StackOverflow uses it, I will be trying out Pagedown as the client-side validator.
Upvotes: 3
Views: 519
Reputation: 197564
You need to invest roughly one to two weeks of proper coding to get a tag-soup parser/handler finished that can sanitize the incoming HTML (produced via Markdown).
I highly suggest a three pass validation and processing scheme:
You can then output the result. Store both the Markdown source and the "baked" HTML so you don't need to repeat this processing for every display operation.
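As a rough illustration of storing both forms, here is a minimal sketch using Python's built-in sqlite3 module. The table name `posts`, the column names, and the `render` callable (whatever Markdown-to-sanitized-HTML pipeline you end up with) are all assumptions made for the example, not part of any particular package.

```python
import sqlite3


def init_db(conn):
    # Hypothetical schema: keep the raw Markdown and the rendered HTML side by side.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS posts ("
        "id INTEGER PRIMARY KEY, "
        "markdown_source TEXT NOT NULL, "
        "html_baked TEXT NOT NULL)"
    )


def save_post(conn, markdown_source, render):
    # `render` is supplied by the caller: it must convert Markdown to HTML
    # and sanitize it. It runs once here, at write time.
    html_baked = render(markdown_source)
    cur = conn.execute(
        "INSERT INTO posts (markdown_source, html_baked) VALUES (?, ?)",
        (markdown_source, html_baked),
    )
    conn.commit()
    return cur.lastrowid


def load_html(conn, post_id):
    # Display path: read the stored HTML, no parsing or sanitizing needed.
    row = conn.execute(
        "SELECT html_baked FROM posts WHERE id = ?", (post_id,)
    ).fetchone()
    return row[0] if row else None
```

Keeping the Markdown source means the user can still edit the original text, and the stored HTML can be regenerated if the sanitizing rules change later.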
Upvotes: 3
Reputation: 943108
Markdown allows arbitrary HTML to be included in it. Since this includes <script> elements, you can have valid Markdown that is also an XSS attack.
Run the incoming data through a Markdown parser to get HTML, then treat it like any other user-submitted HTML: pass it through an HTML parser that applies a whitelist to the elements and attributes.
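To make the whitelist step concrete, here is a minimal sketch built only on Python's standard library. It assumes the Markdown has already been converted to HTML by whichever parser you choose. The lists of allowed tags, attributes and URL schemes are illustrative assumptions, not a vetted policy, and the sketch does not rebalance unclosed or stray tags. For production use, a maintained sanitizer library is likely the safer choice.

```python
from html import escape
from html.parser import HTMLParser
from urllib.parse import urlsplit

# Illustrative whitelist (an assumption for this sketch): tag -> allowed attributes.
ALLOWED_TAGS = {
    "p": set(), "em": set(), "strong": set(), "code": set(), "pre": set(),
    "blockquote": set(), "ul": set(), "ol": set(), "li": set(), "br": set(),
    "a": {"href", "title"},
    "img": {"src", "alt", "title"},
}
URL_ATTRS = {"href", "src"}
ALLOWED_SCHEMES = {"http", "https", "mailto"}
VOID_TAGS = {"br", "img"}          # no closing tag is emitted for these
DROP_CONTENT = {"script", "style"}  # drop the tag and everything inside it


def is_safe_url(value):
    # Remove whitespace and control characters before inspecting the scheme.
    cleaned = "".join(ch for ch in value if ch > " ")
    try:
        scheme = urlsplit(cleaned).scheme.lower()
    except ValueError:
        return False
    # An empty scheme means a relative URL, which is allowed here.
    return scheme == "" or scheme in ALLOWED_SCHEMES


class WhitelistSanitizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return  # unknown tag: drop the tag itself, keep its text content
        kept = []
        for name, value in attrs:
            if name not in ALLOWED_TAGS[tag]:
                continue  # drops onclick, onerror, style, ...
            value = value or ""
            if name in URL_ATTRS and not is_safe_url(value):
                continue  # drops javascript: and other unlisted schemes
            kept.append(f' {name}="{escape(value, quote=True)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        if self.skip_depth or tag not in ALLOWED_TAGS or tag in VOID_TAGS:
            return
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            # Text is re-escaped so it can never be read back as markup.
            self.out.append(escape(data, quote=False))


def sanitize_html(html_text):
    parser = WhitelistSanitizer()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)
```

The key design choice is that the output is rebuilt from parsed events rather than produced by deleting "bad" substrings from the input, so anything not explicitly on the whitelist (comments, event-handler attributes, unknown tags) never reaches the page.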
Upvotes: 2