Reputation: 4634
I was trying to use Nokogiri to turn:
<img class="img-responsive" src="img/logologo.png" alt="">
to:
<%= image_tag('img/logologo.png', :class => 'img-responsive', :alt => '') %>
Here is my code:
a = '<img class="img-responsive" src="img/logologo.png" alt="" width="256" height="256">'
page = Nokogiri::HTML(a)
img = page.css('img')[0]
src = ""
alt = ""
class_atr = ""
src = img['src'] if img['src'].present?
alt = img['alt'] if img['alt'].present?
class_atr = img['class'] if img['class'].present?
result = "<%= image_tag(\'" + src + '\', :class => \'' + class_atr + '\', :alt => \'' + alt + '\')%>'
This feels hard-coded. Is there a way to extract all of the attributes, along with the src? The image tag might also contain height or width attributes. How do I extract all attributes automatically and turn them into ERB?
Upvotes: 1
Views: 1400
Reputation: 160551
OK, there are lots of things to work on. Let's start with how you're parsing the HTML. If all you're doing is parsing a snippet or a single tag, you can use DocumentFragment to tell Nokogiri not to add the usual DOCTYPE, html and body wrappers. Parsing with Nokogiri::HTML wraps the snippet in a full document:
require 'nokogiri'
doc = Nokogiri::HTML('<img class="img-responsive" src="img/logologo.png" alt="">')
doc.to_html # => "<!DOCTYPE html PUBLIC \"-//W3C//DTD HTML 4.0 Transitional//EN\" \"http://www.w3.org/TR/REC-html40/loose.dtd\">\n<html><body><img class=\"img-responsive\" src=\"img/logologo.png\" alt=\"\"></body></html>\n"
Instead, you can do:
doc = Nokogiri::HTML::DocumentFragment.parse('<img class="img-responsive" src="img/logologo.png" alt="">')
doc.to_html # => "<img class=\"img-responsive\" src=\"img/logologo.png\" alt=\"\">"
Next, don't use css, xpath or search when you mean at, at_css or at_xpath. Meditate on this:
doc.css('img').class # => Nokogiri::XML::NodeSet
doc.at('img').class # => Nokogiri::XML::Element
doc.css('img')[0].to_html # => "<img class=\"img-responsive\" src=\"img/logologo.png\" alt=\"\">"
doc.css('img').first.to_html # => "<img class=\"img-responsive\" src=\"img/logologo.png\" alt=\"\">"
doc.at('img').to_html # => "<img class=\"img-responsive\" src=\"img/logologo.png\" alt=\"\">"
That css, xpath and search return a NodeSet is significant and something to remember. at and its variants are equivalent to using first or [0] on the returned NodeSet, returning the first node, so use at and friends if that's what you mean, as it results in code that's not as noisy.
Here's how I'd go about it:
require 'nokogiri'
doc = Nokogiri::HTML::DocumentFragment.parse('<img class="img-responsive" src="img/logologo.png" alt="">')
img = doc.at('img')
img_src = img.delete('src')
img_params = img.map { |p, v| ":%s => '%s'" % [p, v] }.join(', ')
# => ":class => 'img-responsive', :alt => ''"
img_template = "<%%= image_tag('%s', %s) %%>" % [img_src, img_params]
# => "<%= image_tag('img/logologo.png', :class => 'img-responsive', :alt => '') %>"
Of course, using the :k => "v" format for key/values is old-school. I'd recommend changing to:
img_params = img.map { |p, v| "%s: '%s'" % [p, v] }.join(', ') # => "class: 'img-responsive', alt: ''"
which results in:
"<%= image_tag('img/logologo.png', class: 'img-responsive', alt: '') %>"
Upvotes: 0
Reputation: 348
Use the following code to iterate over all <img> tags inside the HTML markup and get their attributes:
require 'nokogiri'
page = Nokogiri::HTML <<-html
  <img class="img-responsive1" src="img/logologo.png" alt="" width="256" height="256">
  <a href="#">A tag</a>
  <img class="img-responsive2" src="logologo222.png">
html
page.css('img').each do |img_node|
  img_attributes = img_node.attributes.values # list of Nokogiri::XML::Attr objects
  # e.g., to output key-value pairs:
  img_attributes.each do |attr|
    p [attr.name, attr.value]
  end
end
Upvotes: 2