Reputation: 89383
Users can edit "articles" in my application. Each article is mastered in the DB and sent to the client as Markdown -- I convert it to HTML client-side with JavaScript.
I'm doing this so that when the user wants to edit the article he can edit and POST the Markdown right back to the server (since it's already on the page).
My question is how to sanitize the Markdown I send to the client -- can I just use Rails' sanitize helper?
Also, any thoughts on this approach in general? Another strategy I thought of was rendering and sanitizing the HTML on the server, and pulling the Markdown to the client only if the user wants to edit the article.
Upvotes: 7
Views: 2753
Reputation: 17246
The other answers here are good, but let me make a few suggestions on sanitization. Rails' built-in sanitizer is decent, but it doesn't guarantee well-formed output, which tends to be half the problem. It's also fairly likely to be exploited, since it's not best-of-breed and its large install base makes it an attractive target for attackers.
I believe the best and most forward-looking sanitization around today is html5lib, because it's written to parse as a browser does and it's a collaboration by a lot of leaders in the field. However, it's a bit on the slow side and not very Ruby-like.
In Ruby I recommend either Loofah, which lifts some of the html5lib sanitization code verbatim but uses Nokogiri and runs much faster, or Sanitize, which has a solid test suite and very good configurability (don't shoot yourself in the foot, though).
I just released a plugin called ActsAsSanitiled which is a rewrite of ActsAsTextiled to automagically sanitize the textiled output as well using the Sanitize gem. It's designed to give you the best of both worlds: input is untouched in the DB, yet the field always outputs safe HTML without needing to remember anything in the template. I don't use Markdown myself, but I would consider adding BlueCloth support.
Upvotes: 4
Reputation: 22016
I follow a couple of principles:
That leads me to the alternative architecture you suggest:
This has been my approach and it works out pretty cleanly.
Upvotes: 4
Reputation: 20747
I haven't used Markdown in Rails, but my approach would be to take the submitted Markdown and store it, as well as an HTML rendered and sanitized copy of it, in the database. That way you're not throwing any information away in your sanitization, and you're not having to re-render the Markdown every time you want to display an article.
Rails' sanitize helper should do the job. There are also a number of plugins (such as xss_shield and xss_terminate) which can be used to whitelist your output, just to make sure you don't forget to sanitize!
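To illustrate the "store both copies" idea, here is a minimal plain-Ruby sketch rather than real Rails code. The `Article` class and the `render_safe_html` helper are hypothetical names made up for this example, and the helper is only a stand-in: it escapes all raw HTML and understands just `**bold**` and blank-line paragraphs. In a real app you would call an actual Markdown library and a sanitizer at that point, and persist both fields as columns.

```ruby
require "erb"

# Hypothetical stand-in for "render Markdown, then sanitize".
# Assumption: it escapes ALL raw HTML first (stricter than a whitelist
# sanitizer) and only supports **bold** and blank-line paragraph breaks.
def render_safe_html(markdown)
  escaped = ERB::Util.html_escape(markdown)
  escaped.split(/\n{2,}/).map do |para|
    "<p>" + para.strip.gsub(/\*\*(.+?)\*\*/, '<strong>\1</strong>') + "</p>"
  end.join("\n")
end

# Hypothetical model: keeps the untouched Markdown plus a cached,
# already-safe HTML copy, so display never has to re-render.
class Article
  attr_reader :body_markdown, :body_html

  def initialize(markdown)
    self.body_markdown = markdown
  end

  # Refresh the cached HTML whenever the Markdown source changes.
  def body_markdown=(markdown)
    @body_markdown = markdown
    @body_html = render_safe_html(markdown)
  end
end

article = Article.new("Hello **world** <script>alert(1)</script>")
puts article.body_markdown # original input, kept for the edit form
puts article.body_html     # cached safe copy, used for display
```

The point of the design is that the edit form always reads `body_markdown`, while templates only ever output `body_html`, so nothing is lost to sanitization and there is no per-request rendering cost.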
Upvotes: 0