Reputation: 9516
Python-Markdown includes features, such as escaping of raw HTML, that are clearly intended to make it safe on untrusted input, and Markdown in general is commonly used to render user input, as it is right here on SO.
But is this implementation really trustworthy? Has anyone here studied it closely enough to conclude that it is safe to run on arbitrary input?
I see there are e.g. "Markdown in Django XSS safe" and "Secure Python Markdown Library", but is 'safe' mode really safe?
Upvotes: 10
Views: 2092
Reputation: 984
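You can use bleach to sanitize the HTML against an allowlist of tags and attributes:

    import bleach

    # Example input: one harmless link plus a script tag that must not survive as markup.
    text = "<a href='https://example.com'>Example</a><script>alert('message');</script>"

    # Only the listed tags and attributes are kept as markup.
    sanitized_text = bleach.clean(
        text,
        tags=['p', 'a', 'code', 'pre', 'blockquote'],
        attributes={'code': ['class'], 'a': ['href']},
    )

Read the documentation for more.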
Upvotes: 0
Reputation: 3604
The Python Markdown library appears to be safe as far as anyone knows, if you use it properly. See the link for details about how to use it safely, but the short version is: it is important to use the latest version, to set safe_mode, and to set enable_attributes=False.
Update: safe_mode is now due to be deprecated because of the security problems with it. See https://github.com/Python-Markdown/markdown/commit/7db56daedf8a6006222f55eeeab748e7789fba89. Instead, use a separate HTML sanitizer, such as HTML Purifier.
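To make "a separate HTML sanitizer" a bit more concrete, here is a rough sketch of the allowlist idea, applied to HTML that a Markdown renderer has already produced. It uses only the standard library so it runs anywhere. The names AllowlistSanitizer and sanitize, and the particular allowlists, are made up for this illustration and do not come from Python-Markdown or any sanitizer library. Treat it as a way to see the principle, not as something to deploy; for real input, prefer a maintained sanitizer.

```python
from html import escape
from html.parser import HTMLParser
from urllib.parse import urlparse

# Hypothetical allowlists, chosen only for this illustration.
ALLOWED_TAGS = {"p", "a", "code", "pre", "blockquote", "em", "strong"}
ALLOWED_ATTRS = {"a": {"href"}, "code": {"class"}}
ALLOWED_SCHEMES = {"http", "https", "mailto"}
DROP_CONTENT = {"script", "style"}  # drop these tags together with their text


class AllowlistSanitizer(HTMLParser):
    """Re-emit only allowlisted tags and attributes; escape all text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return  # disallowed tag: drop the markup, keep its text
        kept = []
        for name, value in attrs:
            if value is None or name not in ALLOWED_ATTRS.get(tag, set()):
                continue
            # Links must carry an explicitly allowed scheme, so
            # javascript: URLs (and relative URLs) are dropped.
            if name == "href":
                scheme = urlparse(value.strip()).scheme.lower()
                if scheme not in ALLOWED_SCHEMES:
                    continue
            kept.append(f' {name}="{escape(value, quote=True)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif not self.skip_depth and tag in ALLOWED_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))


def sanitize(html_text):
    """Return html_text reduced to the allowlisted subset."""
    parser = AllowlistSanitizer()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)


rendered = ('<p><a href="https://example.com" onclick="steal()">Example</a></p>'
            "<script>alert('message');</script>")
print(sanitize(rendered))
# <p><a href="https://example.com">Example</a></p>
```

The point is the order of operations: render the Markdown first, then sanitize the resulting HTML, so that nothing the renderer emits reaches the page unchecked. This sketch does not balance unclosed tags, and it rejects relative links on purpose, since only URLs with an explicitly allowed scheme are kept.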
Upvotes: 5