Reputation: 14815
I use Markdown to give my users a simple way to write posts in my forum script.
I'm trying to sanitize all user input, but I have a problem with Markdown input.
I need to store the Markdown text in the database, not the converted HTML, because users are allowed to edit their posts.
Basically I need something like what StackOverflow does.
I read this article about the XSS vulnerability of Markdown, and the only solution I found is to run HTML Purifier on every output my script produces.
I think this could slow down my script: imagine outputting 20 posts and running HTML Purifier on each one.
So I was trying to find a way to prevent XSS by sanitizing the input instead of the output.
I can't run HTML Purifier on the input because my text is Markdown, not HTML. And if I convert it to HTML, I can't convert it back to Markdown.
I already remove (I hope) all HTML code with:
htmlspecialchars(strip_tags($text));
I've thought of another solution:
When a user submits a new post: convert the input from Markdown to HTML, run HTML Purifier, and if it finds an XSS injection, simply return an error. But I don't know how to do this, nor whether HTML Purifier allows it.
I've found a lot of questions about the same problem here, but all the solutions were to store the input as HTML. I need to store it as Markdown.
Does anyone have any advice?
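A rough sketch of how the reject-on-submit idea could look: render the Markdown, purify the resulting HTML, and refuse the post if purification changed anything. `isPostAcceptable`, `$render`, and `$purify` are hypothetical names; the two callables stand in for the Markdown converter and HTML Purifier and are stubbed here so the snippet runs on its own.

```php
<?php
// Hypothetical helper: $render and $purify are stand-ins for the Markdown
// converter and the HTML Purifier call, not real library functions.
function isPostAcceptable(string $markdown, callable $render, callable $purify): bool
{
    $html = $render($markdown);
    // If purifying changes the HTML, something was removed or rewritten.
    return $purify($html) === $html;
}

// Stub callables so the sketch runs without any library installed.
$render = function (string $md): string {
    return '<p>' . $md . '</p>';
};
$purify = function (string $html): string {
    return preg_replace('/javascript:/i', '', $html);
};

var_dump(isPostAcceptable('hello', $render, $purify));               // bool(true)
var_dump(isPostAcceptable('javascript:alert(1)', $render, $purify)); // bool(false)
```

One caveat: a real purifier may also normalize harmless markup, so a strict string comparison could reject legitimate posts.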
Upvotes: 3
Views: 3159
Reputation: 26129
The HTML output of your Markdown depends only on the Markdown parser, so you can
convert your Markdown to HTML and then sanitize the HTML, as described here:
Upvotes: 0
Reputation: 14815
Solved...
$text = "> hello <a name=\"n\"
> href=\"javascript:alert('xss')\">*you*</a>";
$text = strip_tags($text);
$text = Markdown($text);
echo $text;
It returns:
<blockquote>
<p>hello href="javascript:alert('xss')"><em>you</em></p>
</blockquote>
And not:
<blockquote>
<p>hello <a name="n" href="javascript:alert('xss')"><em>you</em></a></p>
</blockquote>
So it seems that strip_tags() does its job.
Combined with:
$text = preg_replace('/href=(\"|)javascript:/', "", $text);
The entire input should now be sanitized against XSS injections. Correct me if I'm wrong.
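For anyone checking this approach, here is a small sketch of two inputs that pass through both filters unchanged. `filterInput` is a hypothetical wrapper around the two calls above. Whether the first input becomes a clickable link depends on the Markdown converter in use, so treat it as something to test rather than a confirmed exploit.

```php
<?php
// The two input-side filters from this post, wrapped in a hypothetical helper.
function filterInput(string $text): string
{
    $text = strip_tags($text);
    return preg_replace('/href=(\"|)javascript:/', "", $text);
}

// Markdown's own link syntax contains no HTML tag and no href= attribute,
// so neither filter has anything to remove.
$inlineLink = "[click me](javascript:alert('xss'))";
var_dump(filterInput($inlineLink) === $inlineLink); // bool(true)

// The regex is case-sensitive, so a differently cased scheme is not matched.
$mixedCase = "href=\"JavaScript:alert('xss')\"";
var_dump(filterInput($mixedCase) === $mixedCase);   // bool(true)
```

Both strings come out exactly as they went in, which suggests the input-side filters alone may not cover every case and that checking the generated HTML is still worthwhile.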
Upvotes: 1
Reputation: 16709
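Convert the Markdown to HTML, then run HTML Purifier on the result with a whitelist of allowed tags and attributes (so no javascript: commands):
// the nasty stuff :)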
$content = "> hello <a name=\"n\" \n href=\"javascript:alert('xss')\">*you*</a>";
require '/path/to/markdown.php';
// at this point, the generated HTML is vulnerable to XSS
$content = Markdown($content);
require '/path/to/HTMLPurifier/HTMLPurifier.auto.php';
$config = HTMLPurifier_Config::createDefault();
$config->set('Core.Encoding', 'UTF-8');
$config->set('HTML.Doctype', 'XHTML 1.0 Transitional');
$config->set('Cache.DefinitionImpl', null);
// put here every tag and attribute that you want to pass through
$config->set('HTML.Allowed', 'a[href|title],blockquote[cite]');
$purifier = new HTMLPurifier($config);
// here, the javascript command is stripped off
$content = $purifier->purify($content);
print $content;
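On the worry about purifying 20 posts per page: one possible approach is to run the convert-and-purify step once per version of a post and reuse the result. The sketch below uses an in-memory array as the cache. `cachedHtml`, `$pipeline`, and `$cache` are hypothetical names, and the stub pipeline only counts calls rather than invoking Markdown or HTML Purifier.

```php
<?php
// Hypothetical helper: $renderAndPurify stands in for the Markdown() call
// followed by $purifier->purify(); $cache could be any key-value store.
function cachedHtml(string $markdown, array &$cache, callable $renderAndPurify): string
{
    $key = sha1($markdown); // an edited post hashes to a new key
    if (!isset($cache[$key])) {
        $cache[$key] = $renderAndPurify($markdown); // expensive step, run once
    }
    return $cache[$key];
}

// Stub pipeline that counts how often the expensive step runs.
$calls = 0;
$pipeline = function (string $md) use (&$calls): string {
    $calls++;
    return '<p>' . htmlspecialchars($md) . '</p>';
};

$cache = [];
$first  = cachedHtml('hello', $cache, $pipeline);
$second = cachedHtml('hello', $cache, $pipeline);
echo $calls; // 1
```

The Markdown source stays the stored, editable version; the cached HTML is only a derived copy that can be rebuilt at any time.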
Upvotes: 7