Reputation: 69
I have a piece of code that grabs a product title, description, and price, and for that it works great. However, I also need it to get the image URL, which is where my dilemma is. I tried using an XPath expression inside the loop at the bottom, but it lists ALL the images with a width of 220 on EVERY product, which I don't want at all. So basically I get something like this:
product 1 Title here
product 1 Description here
product 1 price here
http://www.test.com/product1.jpg
http://www.test.com/product2.jpg
http://www.test.com/product3.jpg
http://www.test.com/product4.jpg
product 2 Title here
product 2 Description here
product 2 price here
http://www.test.com/product1.jpg
http://www.test.com/product2.jpg
http://www.test.com/product3.jpg
http://www.test.com/product4.jpg
Whereas I obviously want product 1 to have just http://www.test.com/product1.jpg, product 2 to have http://www.test.com/product2.jpg, and so on. The images are just in a div tag with no class or ID, which is why I couldn't simply target them with a CSS selector. I'm really new to Ruby/Nokogiri, so any help would be great.
require 'nokogiri'
require 'open-uri'
url = "http://thewebsitehere"
data = Nokogiri::HTML(open(url))
products = data.css('.item')
products.each do |product|
  puts product.at_css('.vproduct_list_title').text.strip
  puts product.at_css('.vproduct_list_descr').text.strip
  puts product.at_css('.price-value').text.strip
  puts product.xpath('//img[@width = 220]/@src').map {|a| a.value }
end
Upvotes: 0
Views: 1032
Reputation: 6419
Try changing:
puts product.xpath('//img[@width = 220]/@src').map {|a| a.value }
to:
puts product.xpath('.//img[@width = 220]/@src').map {|a| a.value }
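The leading '.' makes the expression relative to the current node, so it matches only images that are descendants of that product (and you're not peeking at product 2's images). Without it, '//' starts the search at the document root no matter which node you call xpath on, which is why every product printed every image.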
Upvotes: 2
Reputation: 54674
File.basename (a class method) will return only the filename:
File.basename('http://www.test.com/product4.jpg')
#=> "product4.jpg"
So you probably want something like this:
puts product.xpath('.//img[@width = 220]/@src').map {|a| File.basename(a.value) }
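One caveat, sketched below: File.basename treats the URL as a plain path string, so a query string would end up in the result. The URL with "?size=220" is a hypothetical example, not something from the question. Parsing the URL first and taking its path is one way to avoid that.

```ruby
require 'uri'

# Hypothetical URL with a query string, for illustration only.
url = 'http://www.test.com/product4.jpg?size=220'

# File.basename only splits on '/', so the query string is kept.
naive = File.basename(url)

# Parsing the URL first isolates the path component.
clean = File.basename(URI.parse(url).path)

puts naive
puts clean
```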
Upvotes: 0