Reputation: 2906
I need to remove newlines from any object/embed tags. I am currently attempting to do so using Nokogiri by doing the following:
s = "<div>
<object height='450' width='600'>
<param name='allowfullscreen' value='true'>
<param name='allowscriptaccess' value='always'>
<param name='movie' value='http://vimeo.com/moogaloop.swf?clip_id=3317924&server=vimeo.com&show_title=1&show_byline=1&show_portrait=0&color=&fullscreen=1'>
<embed src='http://vimeo.com/moogaloop.swf?clip_id=3317924&server=vimeo.com&show_title=1&show_byline=1&show_portrait=0&color=&fullscreen=1' type='application/x-shockwave-flash' allowfullscreen='true' allowscriptaccess='always' height='450' width='600'>
</embed>
</object>
</div>"
doc = Nokogiri::HTML(s)
doc.css('object').each { |o| o.inner_html.gsub!(/\n/, ""); puts o.inner_html }
Please note that the example is for object tags only.
Printing o.inner_html at the end of the block shows that no replacement has occurred, even though the gsub pattern appears correct. Once that part is resolved, I also need to make sure that the object node inside doc itself is saved with the updated values.
Any help is most appreciated. Thanks.
Upvotes: 3
Views: 4230
Reputation: 303215
Got it!
require 'nokogiri'
s = <<ENDHTML
<div>
<object height='450' width='600'>
<param name='allowfullscreen' value='true'><param name='allowscriptaccess' value='always'>
<param name='movie' value='http://vimeo.com/moogaloop.swf?clip_id=3317924&server=vimeo.com&show_title=1&show_byline=1&show_portrait=0&color=&fullscreen=1'>
<embed src='http://vimeo.com/moogaloop.swf?clip_id=3317924&server=vimeo.com&show_title=1&show_byline=1&show_portrait=0&color=&fullscreen=1' type='application/x-shockwave-flash' allowfullscreen='true' allowscriptaccess='always' height='450' width='600'>
</embed>
</object>
</div>
ENDHTML
doc = Nokogiri::HTML(s)
doc.css('object,embed').each{ |e| e.inner_html = e.inner_html.gsub(/\n/,'') }
puts doc.serialize( save_with: 0 )
#=> <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
#=> <html><body><div>
#=> <object height="450" width="600"><param name="allowfullscreen" value="true"><param name="allowscriptaccess" value="always"><param name="movie" value="http://vimeo.com/moogaloop.swf?clip_id=3317924&server=vimeo.com&show_title=1&show_byline=1&show_portrait=0&color=&fullscreen=1"><embed src="http://vimeo.com/moogaloop.swf?clip_id=3317924&server=vimeo.com&show_title=1&show_byline=1&show_portrait=0&color=&fullscreen=1" type="application/x-shockwave-flash" allowfullscreen="true" allowscriptaccess="always" height="450" width="600"></embed></object>
#=> </div></body></html>
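The key is assigning to inner_html. Calling .inner_html.gsub! is not the same as inner_html = inner_html.gsub: inner_html builds and returns a new string on each call, so gsub! only changes that temporary string and leaves the node untouched. Assigning the result back to inner_html replaces the node's children, so the change is kept in doc.
To print the result, use .serialize with the hash :save_with => 0 passed in to prevent Nokogiri from generating newlines between tags in the output.
Upvotes: 8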
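To make the difference between mutating the returned string and assigning it back concrete, here is a minimal sketch in plain Ruby. FakeNode is a hypothetical stand-in written for this illustration, not Nokogiri's actual implementation; it only assumes that the getter builds a fresh string on every call and that the setter replaces the stored content, which is the behaviour the answer relies on.

```ruby
# Hypothetical stand-in for a parsed node (NOT Nokogiri's real class).
# Assumption: the getter serializes on demand, the setter stores new content.
class FakeNode
  def initialize(children)
    @children = children
  end

  # Builds and returns a brand-new String on every call.
  def inner_html
    @children.join
  end

  # Replaces the stored content with the given markup.
  def inner_html=(markup)
    @children = [markup]
  end
end

node = FakeNode.new(["<param name='a'>\n", "<param name='b'>\n"])

# gsub! mutates the temporary String returned by the getter;
# that String is discarded, so the node still contains newlines.
node.inner_html.gsub!(/\n/, '')
unchanged = node.inner_html.include?("\n")

# Assigning the gsub result back goes through the setter,
# so the node's stored content is actually updated.
node.inner_html = node.inner_html.gsub(/\n/, '')
changed = !node.inner_html.include?("\n")

puts "still has newlines after gsub!: #{unchanged}"
puts "newlines removed after assignment: #{changed}"
```

The same reasoning applies to any getter that computes its return value on demand: an in-place method such as gsub! affects only the returned copy, so the result has to be written back through the setter.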