JammyStressford

Reputation: 13

web scraping/export to CSV with Ruby

Ruby n00b here in hope of some guidance. I am looking to scrape a website (600-odd names and links on one page) and output the results to CSV. The scraping itself works fine (the output correctly fills the terminal as the script runs), but I can't get the CSV to populate. The code:

   require 'rubygems'
   require 'nokogiri'   
   require 'open-uri'
   require 'csv'

   url = "http://www.example.com/page/"
   page = Nokogiri::HTML(open(url))

   page.css('.item').each do |item|
     name = item.at_css('a').text
     link = item.at_css('a')[:href]
     foo = puts "#{name}"
     bar = "#{link}"

     CSV.open("file.csv", "wb") do |csv|
       csv << [foo, bar]
     end
   end

   puts "upload complete!"

...replacing csv << [foo, bar] with csv << [name, link] just puts the final iteration into the CSV. I feel there's something basic I am missing here. Thanks for reading.

Upvotes: 0

Views: 1249

Answers (1)

Chirantan

Reputation: 15664

The problem is that you're calling CSV.open for every single item. Opening the file in "wb" (write) mode truncates it, so each iteration overwrites what the previous one wrote, and at the end you're left with only the last item in the csv file.
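
You can see the truncating behaviour in isolation (demo.csv here is just a hypothetical scratch file):

   require 'csv'

   CSV.open("demo.csv", "wb") { |csv| csv << ["first row"] }
   CSV.open("demo.csv", "wb") { |csv| csv << ["second row"] }

   # demo.csv now contains only "second row" -- each CSV.open in "wb"
   # mode truncated whatever had been written before it.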

Move the CSV.open call before page.css('.item').each and it should work:

   CSV.open("file.csv", "wb") do |csv|
     page.css('.item').each do |item|
       name = item.at_css('a').text
       link = item.at_css('a')[:href]
       csv << [name, link]
     end
   end
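
If you really did want to open the file inside the loop, append mode ("ab") would at least stop the truncation, though reopening the file once per item is wasteful compared to the version above (just a sketch of the alternative, not a recommendation):

   page.css('.item').each do |item|
     name = item.at_css('a').text
     link = item.at_css('a')[:href]
     # "ab" appends rather than truncating, so earlier rows survive
     CSV.open("file.csv", "ab") { |csv| csv << [name, link] }
   end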

Upvotes: 2
