Reputation: 13
Ruby n00b here, hoping for some guidance. I am looking to scrape a website (600-odd names and links on one page) and output the results to CSV. The scraping itself works fine (the output correctly fills the terminal as the script runs), but I can't get the CSV to populate. The code:
require 'rubygems'
require 'nokogiri'
require 'open-uri'
require 'csv'
url = "http://www.example.com/page/"
page = Nokogiri::HTML(open(url))
page.css('.item').each do |item|
  name = item.at_css('a').text
  link = item.at_css('a')[:href]
  foo = puts "#{name}"
  bar = "#{link}"
  CSV.open("file.csv", "wb") do |csv|
    csv << [foo, bar]
  end
end
puts "upload complete!"
Replacing csv << [foo, bar] with csv << [name, link] just puts the final iteration into the CSV. I feel there's something basic I am missing here. Thanks for reading.
Upvotes: 0
Views: 1249
Reputation: 15664
The problem is that you're calling CSV.open for every single item, so each iteration overwrites the file with just that one row, and at the end you're left with only the last item in the CSV file. Move the CSV.open call before page.css('.item').each and it should work.
CSV.open("file.csv", "wb") do |csv|
page.css('.item').each do |item|
name = item.at_css('a').text
link = item.at_css('a')[:href]
csv << [name, link]
end
end
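If you'd rather keep the per-item loop structure from the question, another option is to open the file in append mode so each write adds a row instead of truncating the file. This is just a minimal sketch assuming the same page structure and selectors; opening the file once outside the loop, as above, is still the simpler and faster approach.

# Empty the file once up front so re-runs don't accumulate old rows
CSV.open("file.csv", "wb") { |csv| }

page.css('.item').each do |item|
  name = item.at_css('a').text
  link = item.at_css('a')[:href]
  # "ab" appends to the file instead of overwriting it on each open
  CSV.open("file.csv", "ab") do |csv|
    csv << [name, link]
  end
end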
Upvotes: 2