Should HTML be sanitized on input?

Question

This question has already been asked here quite a few times, and most everyone agrees that raw HTML input should be stored in the database, and escaped on output. However, I think my case may be slightly different.

The user is able to input some tags (em, strong, span, etc.), but others are removed (script, style, meta, etc.). So what I'm doing is taking the raw HTML and sending it through bleach.clean to strip (not escape) all the unsafe tags. To me this feels a lot more like validation/sanitizing than escaping content for display, especially since no matter what format I serve the data in (HTML, JSON, or any other format), I would be stripping the unsafe tags.
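For reference, a minimal sketch of the stripping step described above. The allow-list and the helper name `sanitize_html` are only illustrative assumptions, not my exact code; the `tags`, `attributes` and `strip` arguments are the ones `bleach.clean` documents.

```python
import bleach

# Hypothetical allow-list: the tags named above as examples of permitted input.
ALLOWED_TAGS = {"em", "strong", "span"}


def sanitize_html(raw_html: str) -> str:
    """Strip (rather than escape) every tag that is not in ALLOWED_TAGS."""
    return bleach.clean(
        raw_html,
        tags=ALLOWED_TAGS,
        attributes={},  # assumption: no attributes are allowed on any tag
        strip=True,     # remove disallowed tags instead of escaping them
    )


cleaned = sanitize_html("<em>hi</em><script>alert(1)</script>")
print(cleaned)
```

As far as I can tell, `strip=True` removes the disallowed tag markup but keeps the text that was inside it, so the contents of a stripped `script` element still end up in the stored value as plain text.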

Should I still be sanitizing it on output, or is this a case where it's better to do it on input?

Bonus Question:

If this is the proper approach for this scenario, what's the best way to implement it in Django: form-level validation or model-level validation?
