Reputation: 223
This question has already been asked here quite a few times, and almost everyone agrees that raw HTML input should be stored in the database and escaped on output. However, I think my case may be slightly different.
The user is able to input some tags (em, strong, span, etc.), but others are removed (script, style, meta, etc.).
So what I'm doing is taking the raw HTML and sending it through bleach.clean to strip (not escape) all the unsafe tags. To me this feels a lot more like validation/sanitizing than escaping content for display, especially since I would be stripping the unsafe tags no matter what format I serve the data in (HTML, JSON or any other format).
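To make the strip-versus-escape distinction concrete, here is a minimal standard-library sketch. It is only a stand-in for illustration, not what bleach does internally and not a replacement for it; the `ALLOWED_TAGS` set and the `strip_unsafe` helper are hypothetical names, and dropping all attributes is an assumption I made to keep it short.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical allowlist, mirroring the tags mentioned above.
ALLOWED_TAGS = {"em", "strong", "span"}


class TagStripper(HTMLParser):
    """Keep allowlisted tags (without attributes), drop other tags, keep text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ALLOWED_TAGS:
            # Attributes are dropped on purpose (assumption for this sketch).
            self.parts.append(f"<{tag}>")

    def handle_endtag(self, tag):
        if tag in ALLOWED_TAGS:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        # Text is kept, but re-escaped so it cannot introduce new markup.
        self.parts.append(escape(data, quote=False))


def strip_unsafe(raw_html):
    """Strip (not escape) every tag that is not in ALLOWED_TAGS."""
    parser = TagStripper()
    parser.feed(raw_html)
    parser.close()
    return "".join(parser.parts)


raw = "<em>hi</em><script>alert(1)</script>"
print(strip_unsafe(raw))  # stripping: <em>hi</em>alert(1)
print(escape(raw))        # escaping: every tag is turned into visible text
```

Stripping changes the data itself (the script tag is gone for good), while escaping only changes how the same data is rendered, which is why the former feels like input sanitizing to me.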
Should I still be sanitizing it on output, or is this a case where it's better to do it on input?
If this is the proper approach for this scenario, what's the best way to implement it in Django? Form-level validation or model-level validation?
Upvotes: 2
Views: 854
Reputation: 1687
When it comes to sanitization, there is no such thing as too much. You should always sanitize ALL user input BEFORE it is inserted into the database, both to sanitize the HTML itself and to protect against SQL injection attacks. It doesn't hurt to run an additional check and sanitize the output when it goes from the database into the web page.
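As a rough sketch of the two layers, assuming a plain `sqlite3` database rather than Django: in practice the SQL injection protection usually comes from parameterized queries (which Django's ORM also uses) rather than from HTML sanitization, and the extra output check can be an escape at render time. The table name and sample input are made up for illustration.

```python
import sqlite3
from html import escape

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE comments (body TEXT)")

# Hypothetical hostile input mixing an SQL fragment with markup.
hostile = "'); DROP TABLE comments; --<b>hi</b>"

# The ? placeholder binds the value as data, so it is never parsed as SQL.
conn.execute("INSERT INTO comments (body) VALUES (?)", (hostile,))

# The value comes back unchanged and the table is still there.
stored = conn.execute("SELECT body FROM comments").fetchone()[0]

# Additional check on the way out: escape before placing it in a page.
rendered = escape(stored)
print(rendered)
```

The point is that the two concerns are handled by different tools: parameter binding on the way into the database, and sanitizing or escaping on the way into the page.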
Upvotes: 1