Kostas

Reputation: 8595

How do I encode/decode HTML entities in Ruby?

I am trying to decode some HTML entities, such as '&lt;' becoming '<'.

I have an old gem (html_helpers) but it seems to have been abandoned twice.

Any recommendations? I will need to use it in a model.

Upvotes: 231

Views: 163619

Answers (8)

Timothy Alexis Vass

Reputation: 2705

In Rails, we can encode with ERB::Util.html_escape (for HTML) and ERB::Util.url_encode (for URLs).
In views, these are aliased as h and u.

http://ruby-doc.org/stdlib-1.9.3/libdoc/erb/rdoc/ERB/Util.html
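As a rough sketch, both methods can also be called outside a view by requiring the standard erb library. The helper names encode_for_html and encode_for_url and the sample strings are made up for illustration. Note that these only encode; neither one decodes entities.

```ruby
require 'erb'

# Hypothetical helper names, used only for this illustration.
def encode_for_html(text)
  # Replaces &, <, >, " and ' with character references.
  ERB::Util.html_escape(text)
end

def encode_for_url(text)
  # Percent-encodes every character outside the unreserved URL set.
  ERB::Util.url_encode(text)
end

puts encode_for_html(%q(<a href="x">Tom & Jerry</a>))
# prints: &lt;a href=&quot;x&quot;&gt;Tom &amp; Jerry&lt;/a&gt;
puts encode_for_url('a b&c=d')
# prints: a%20b%26c%3Dd
```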

Upvotes: 0

memonk

Reputation: 487

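Note that raw does not decode entities; in a Rails view it outputs a string without escaping it:

<%= raw '<html>' %>

So,

<%= raw '&lt;br&gt;' %>

is sent to the browser unchanged, and the browser displays it as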

<br>

Upvotes: 42

Henry Le

Reputation: 1411

I think the Nokogiri gem is also a good choice. It is very stable and has a large community of contributors.

Samples:

require 'nokogiri'
a = Nokogiri::HTML.parse "foo&nbsp;b&auml;r"
a.text
=> "foo bär"

or

a = Nokogiri::HTML.parse "&iexcl;I&#39;m highly&nbsp;annoyed with character references!"
a.text
=> "¡I'm highly annoyed with character references!"

Upvotes: 58

Usman

Reputation: 1136

<% str="<h1> Test </h1>" %>

result: &lt; h1 &gt; Test &lt; /h1 &gt;

<%= CGI.unescapeHTML(str).html_safe %>

Upvotes: -4

Damien MATHIEU

Reputation: 32629

To encode the characters, you can use CGI.escapeHTML:

string = CGI.escapeHTML('test "escaping" <characters>')

To decode them, there is CGI.unescapeHTML:

CGI.unescapeHTML("test &quot;unescaping&quot; &lt;characters&gt;")

Of course, before that you need to require the CGI library:

require 'cgi'

And if you're in Rails, you don't need to use CGI to encode the string. There's the h method.

<%= h 'escaping <html>' %>
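Putting the two CGI calls together, a minimal round-trip sketch looks like this. The helper name html_round_trip is hypothetical and exists only for this example.

```ruby
require 'cgi'

# Hypothetical helper, named only for this illustration.
# Returns the encoded string and the result of decoding it again.
def html_round_trip(text)
  encoded = CGI.escapeHTML(text)
  decoded = CGI.unescapeHTML(encoded)
  [encoded, decoded]
end

encoded, decoded = html_round_trip('test "escaping" <characters>')
puts encoded  # test &quot;escaping&quot; &lt;characters&gt;
puts decoded  # test "escaping" <characters>
```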

Upvotes: 335

Ivailo Bardarov

Reputation: 3895

HTMLEntities can do it:

: jmglov@laurana; sudo gem install htmlentities
Successfully installed htmlentities-4.2.4
: jmglov@laurana;  irb
irb(main):001:0> require 'htmlentities'
=> []
irb(main):002:0> HTMLEntities.new.decode "&iexcl;I&#39;m highly&nbsp;annoyed with character references!"
=> "¡I'm highly annoyed with character references!"

Upvotes: 168

kartouch

Reputation: 11

You can use the htmlascii gem:

Htmlascii.convert string

Upvotes: 0

Jason L Perry

Reputation: 1255

If you don't want to add a new dependency just to do this (like HTMLEntities) and you're already using Hpricot, it can both escape and unescape for you. It handles much more than CGI:

Hpricot.uxs "foo&nbsp;b&auml;r"
=> "foo bär"
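For comparison, here is a small sketch of where CGI stops. As far as I know, CGI.unescapeHTML only decodes &amp;amp;, &amp;lt;, &amp;gt;, &amp;quot;, &amp;#39; and numeric references, so named entities such as &amp;nbsp; and &amp;auml; pass through untouched. The helper name cgi_decode is hypothetical.

```ruby
require 'cgi'

# Hypothetical helper, named only for this illustration.
def cgi_decode(text)
  CGI.unescapeHTML(text)
end

# Named entities outside the basic set are left as-is.
puts cgi_decode('foo&nbsp;b&auml;r')  # foo&nbsp;b&auml;r

# Decimal and hexadecimal numeric references are decoded.
puts cgi_decode('b&#228;r b&#xE4;r')  # bär bär

# The basic markup entities are decoded.
puts cgi_decode('&lt;b&gt; &amp; &quot;q&quot;')  # <b> & "q"
```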

Upvotes: 9
