Reputation: 2521
I have the following HTML:
<html>
<body>
<h1>Foo</h1>
<p>The quick brown fox.</p>
<h1>Bar</h1>
<p>Jumps over the lazy dog.</p>
</body>
</html>
I'd like to change it into the following HTML:
<html>
<body>
<p class="title">Foo</p>
<p>The quick brown fox.</p>
<p class="title">Bar</p>
<p>Jumps over the lazy dog.</p>
</body>
</html>
How can I find and replace certain HTML tags? I can use the Nokogiri gem.
Upvotes: 15
Views: 12221
Reputation: 369624
#!/usr/bin/env ruby
require 'rubygems' # only needed on Ruby 1.8
require 'nokogiri'
doc = Nokogiri::HTML.parse <<-HERE
<html>
<body>
<h1>Foo</h1>
<p>The quick brown fox.</p>
<h1>Bar</h1>
<p>Jumps over the lazy dog.</p>
</body>
</html>
HERE
# Rename each <h1> to <p> and mark it with class="title".
doc.search('h1').each do |heading|
  heading.name = 'p'
  heading['class'] = 'title'
end
puts doc.to_html
Upvotes: 8
Reputation: 6345
This seems to work:
require 'rubygems'
require 'nokogiri'
markup = Nokogiri::HTML.parse(<<-somehtml)
<html>
<body>
<h1>Foo</h1>
<p>The quick brown fox.</p>
<h1>Bar</h1>
<p>Jumps over the lazy dog.</p>
</body>
</html>
somehtml
markup.css('h1').each do |el|
  el.name = 'p'
  el.set_attribute('class', 'title')
end
puts markup.to_html
# >> <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
# >> <html><body>
# >> <p class="title">Foo</p>
# >> <p>The quick brown fox.</p>
# >> <p class="title">Bar</p>
# >> <p>Jumps over the lazy dog.</p>
# >> </body></html>
Upvotes: 17
Reputation: 511
Try this:
require 'nokogiri'
html_text = "<html><body><h1>Foo</h1><p>The quick brown fox.</p><h1>Bar</h1><p>Jumps over the lazy dog.</p></body></html>"
doc = Nokogiri::HTML(html_text)
# Rename each <h1> to <p> and add class="title".
doc.xpath("//h1").each { |node| node.name = "p"; node.set_attribute("class", "title") }
puts doc.to_html
Upvotes: 19