Reputation: 4730
I'm working with Ruby on Rails and in my app there's a text area with tinyMCE in it, so users can add styles to the text, and even upload images and videos.
When these posts are displayed in the listings on the home page, I don't want their styles, images, or videos to be shown.
For example, let's say I write:
How are you doing? (some image/video here)
Then I would like to simply show the following in the listings:
How are you doing? (no image/video shown)
Upvotes: 0
Views: 279
Reputation: 5899
You have several options. Stripping and scrubbing HTML is also very useful when you have to render screen-scraped web content. Your case might be simpler, as you don't have to "expect the unexpected" from content gathered by web crawls.
You can use strip_tags, which removes the tags from the string:
strip_tags("Strip <i>these</i> tags!")
# => Strip these tags!
strip_tags("<b>Bold</b> no more! <a href='more.html'>See more here</a>...")
# => Bold no more! See more here...
strip_tags("<div id='top-bar'>Welcome to my website!</div>")
# => Welcome to my website!
The sanitize method should escape some tags and remove others. To learn more about Rails sanitization, go here.
If you need more control, check out sanitize or loofah. I prefer loofah, but sanitize might suit your tastes better. Loofah is built on nokogiri, and you can define fine-grained rules for how you want to massage the HTML or HTML fragments. Besides whitelisting and stripping, you can also scrub tags:
span2div = Loofah::Scrubber.new do |node|
  node.name = "div" if node.name == "span"
end
.. which changes spans to divs.
Upvotes: 3
Reputation: 4109
This has always been a difficult one to cover fully.
From your spec, I understand that you want to remove the HTML tags that carry styling.
The following example may help you.
app/controllers/posts_controller.rb
def index
  # Load every post for the listing page
  @posts = Post.all
end
app/helpers/posts_helper.rb
def strip_html(content)
  # Remove anything that looks like an opening or closing tag
  content.gsub(/<\/?[^>]*>/, "")
end
app/views/posts/index.html.erb
<ul>
  <% @posts.each do |post| %>
    <li>
      <h1><a href="<%= post_path(post) %>"><%= post.title %></a></h1>
      <p><%= strip_html(post.body) %></p>
    </li>
  <% end %>
</ul>
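For a listing you may also want to tidy up whitespace and shorten long posts once the tags are gone. Here is a rough plain-Ruby sketch of that idea. The name plain_preview and the 80-character default are made up for illustration. It replaces each tag with a space rather than an empty string, so that adjacent paragraphs don't run together, at the cost of an occasional stray space before punctuation.

```ruby
# Hypothetical helper (not part of Rails): turn rich-text HTML into a
# short plain-text preview suitable for a listing.
def plain_preview(html, limit = 80)
  # Same tag pattern as strip_html above, but each tag becomes a space
  # so "<p>a</p><p>b</p>" yields "a b" instead of "ab".
  text = html.to_s.gsub(/<\/?[^>]*>/, " ")

  # Collapse runs of whitespace and trim the ends.
  text = text.gsub(/\s+/, " ").strip

  # Cut long posts down to the limit and mark the cut.
  return text if text.length <= limit
  text[0, limit].rstrip + "..."
end

puts plain_preview("<p>How are <b>you</b> doing?</p><img src='/cat.png' />")
```

In the view you would then call plain_preview(post.body) in place of strip_html(post.body). Like any regex-based approach, this only handles well-formed tags.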
Upvotes: 1
Reputation: 85794
The strip_tags helper will remove HTML tags from a string.
html = "<em>Hello!</em> <img src='/logo.png' />"
strip_tags(html) # => 'Hello! '
However! While we're on the subject of allowing users to enter their own HTML, please make sure you are entirely aware of the consequences. You absolutely must be using an HTML whitelist filter to block XSS attacks. In fact, if you're allowing Flash embeds, I'm not sure the input can be sanitized at all, since Flash files are allowed to run arbitrary JavaScript at will.
Please be sure that you are fully aware of the issues involved before proceeding. The ability to upload arbitrary HTML should only ever be granted to a small group of extremely trusted users whom you personally know.
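To make the point concrete, here is a small plain-Ruby sketch showing why removing tags is not the same as sanitizing. This is not a whitelist filter, just an illustration: the helper name safe_listing_text is invented, and it uses a simple regex in place of strip_tags so it runs outside Rails. An unterminated tag slips past the regex, so whatever is left should still be HTML-escaped before it is rendered.

```ruby
require "erb"

# Hypothetical helper: strip tags for display, then escape what remains.
def safe_listing_text(html)
  # A naive regex strip only removes complete <...> tags.
  stripped = html.to_s.gsub(/<\/?[^>]*>/, "")

  # Escape &, <, >, " and ' so leftover markup is shown as text.
  ERB::Util.html_escape(stripped)
end

# The <img ... has no closing ">", so the regex leaves it in place;
# escaping turns its "<" into "&lt;".
puts safe_listing_text("<b>Hi</b> <img src=x onerror=alert(1)")
```

Rails 3 and later escape the output of <%= %> by default, so the danger is mostly in places where the string is marked as safe or passed through raw.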
Upvotes: 2