Marianne

Reputation: 69

Scraping and storing in CSV in Ruby

I am a Ruby newbie and I wrote my first scraper today. It is designed to store recipes in a CSV file, but I can't figure out why it doesn't work. Here is my code:

recipe.rb:

require 'csv'
require 'nokogiri'
require 'open-uri'


def write_csv(ingredient)

  doc = Nokogiri::HTML(open("http://www.marmiton.org/recettes/recherche.aspx?aqt=#{ingredient}"), nil, 'utf-8')
  doc.search(".m_contenu_resultat").first(10).each do |item|
    name = item.search('.m_titre_resultat a').text
    description = item.search('.m_texte_resultat').text
    cooking_time = item.search('.m_detail_time').text
    diff = item.search('.m_detail_recette').text.split('-')
    difficulty = diff[2]
    recipes = [name, description, cooking_time, difficulty]
    CSV.open('recueil.csv', 'wb') do |csv|
      csv << recipes
    end
  end
end

write_csv('chocolat')

Thank you so much for your answers; they will help me a lot!

Upvotes: 3

Views: 441

Answers (3)

Marianne

Reputation: 69

It worked! I changed my code as shown below, collecting each recipe as a hash in an array and writing the CSV file once at the end:

require 'csv'
require 'nokogiri'
require 'open-uri'

def write_csv(ingredient)
  recipes = []
  # URI.open is needed on Ruby 3.0+, where Kernel#open no longer fetches URLs.
  doc = Nokogiri::HTML(URI.open("http://www.marmiton.org/recettes/recherche.aspx?aqt=#{ingredient}"), nil, 'utf-8')
  doc.search('.m_contenu_resultat').first(10).each do |item|
    name = item.search('.m_titre_resultat a').text
    description = item.search('.m_texte_resultat').text
    cooking_time = item.search('.m_detail_time').text
    diff = item.search('.m_detail_recette').text.split('-')
    difficulty = diff[2]
    # Collect each recipe as a hash. cooking_time has to be stored here too,
    # otherwise its CSV column comes out empty.
    recipes << {
      name: name,
      description: description,
      cooking_time: cooking_time,
      difficulty: difficulty
    }
  end

  # Mode 'a' appends, so the header row is written again on every run.
  CSV.open('recueil.csv', 'a') do |csv|
    csv << ['name', 'description', 'cooking_time', 'difficulty']
    recipes.each do |recipe|
      csv << [
        recipe[:name],
        recipe[:description],
        recipe[:cooking_time],
        recipe[:difficulty]
      ]
    end
  end
end

write_csv('chocolat')
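
One caveat with mode 'a' is that the header row is appended again each time the script runs. A possible way around that is to write the header only when the file is missing or empty. The sketch below uses made-up recipe data instead of the live site, and the `HEADERS` constant, the `SAMPLE` data and the `append_recipes` helper are hypothetical names chosen for this example:

```ruby
require 'csv'
require 'tmpdir'

# Column order shared by the header row and the data rows (hypothetical name).
HEADERS = %i[name description cooking_time difficulty].freeze

# Hypothetical helper: append recipe hashes, writing the header row only once.
def append_recipes(path, recipes)
  needs_header = !File.exist?(path) || File.zero?(path)
  CSV.open(path, 'a') do |csv|
    csv << HEADERS.map(&:to_s) if needs_header
    recipes.each { |recipe| csv << recipe.values_at(*HEADERS) }
  end
end

# Made-up sample data standing in for the scraped results.
SAMPLE = [
  { name: 'Fondant', description: 'Rich cake', cooking_time: '20 min', difficulty: 'easy' },
  { name: 'Mousse', description: 'Light dessert', cooking_time: '0 min', difficulty: 'medium' }
].freeze

Dir.mktmpdir do |dir|
  path = File.join(dir, 'recueil.csv')
  2.times { append_recipes(path, SAMPLE) } # simulate running the script twice
  puts File.read(path)
end
```

Using `values_at` with the same `HEADERS` list keeps the columns in step with the header, so a key forgotten in the hash shows up as an empty column rather than shifting the others.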

Upvotes: 1

peter

Reputation: 42192

You don't say what doesn't work or which errors you get, so I have to speculate.

I tried your script and had difficulties with the encoding: the site is in French, so there are lots of special characters.

Try again with this at the top of your script; it should solve at least that problem.

# encoding: utf-8
Encoding.default_external = Encoding::UTF_8
Encoding.default_internal = Encoding::UTF_8
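
If you would rather not change the global defaults, a more local alternative is to be explicit about encodings where the data enters and leaves the script. This is only a minimal sketch using the standard library: it assumes, purely for illustration, that the text arrives as ISO-8859-1 bytes, and the sample title and the helpers `to_utf8` and `write_utf8_csv` are hypothetical names, not part of the original script.

```ruby
require 'csv'
require 'tmpdir'

# Hypothetical helper: re-encode text from a known source encoding to UTF-8.
def to_utf8(text, source_encoding)
  text.encode('UTF-8', source_encoding)
end

# Hypothetical helper: write rows to a CSV file opened explicitly as UTF-8,
# so the result does not depend on Encoding.default_external.
def write_utf8_csv(path, rows)
  CSV.open(path, 'w:UTF-8') do |csv|
    rows.each { |row| csv << row }
  end
end

# Made-up sample: a French title stored as ISO-8859-1 bytes.
LATIN1_TITLE = "G\u00E2teau au chocolat".encode('ISO-8859-1')

Dir.mktmpdir do |dir|
  path = File.join(dir, 'recueil.csv')
  write_utf8_csv(path, [[to_utf8(LATIN1_TITLE, 'ISO-8859-1'), 'facile']])
  puts File.read(path, encoding: 'UTF-8')
end
```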

Upvotes: 0

Lasse Sviland

Reputation: 1517

Because you open your CSV file with 'wb' inside the loop, you overwrite the previous contents every time, so only the last recipe is left in the file. You should either append to the file like this:

CSV.open('recueil.csv', 'a') do |csv|

or you could open it before you start looping like this:

def write_csv(ingredient)
  # URI.open is needed on Ruby 3.0+, where Kernel#open no longer fetches URLs.
  doc = Nokogiri::HTML(URI.open("http://www.marmiton.org/recettes/recherche.aspx?aqt=#{ingredient}"), nil, 'utf-8')
  csv = CSV.open('recueil.csv', 'wb')
  doc.search(".m_contenu_resultat").first(10).each do |item|
    name = item.search('.m_titre_resultat a').text
    description = item.search('.m_texte_resultat').text
    cooking_time = item.search('.m_detail_time').text
    diff = item.search('.m_detail_recette').text.split('-')
    difficulty = diff[2]
    recipes = [name, description, cooking_time, difficulty]
    csv << recipes
  end
  csv.close
end
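
To see the difference without hitting the site, here is a small sketch with made-up rows. The `ROWS` data and the helper names `write_reopening` and `write_once` are hypothetical. The second helper uses the block form of `CSV.open`, which closes the file for you even if an error is raised partway through.

```ruby
require 'csv'
require 'tmpdir'

# Made-up rows standing in for scraped recipes.
ROWS = [%w[fondant easy], %w[mousse medium], %w[tarte hard]].freeze

# Reopens the file with 'wb' for every row, as in the question:
# each open truncates the file, so earlier rows are lost.
def write_reopening(path, rows)
  rows.each do |row|
    CSV.open(path, 'wb') { |csv| csv << row }
  end
end

# Opens the file once and writes every row inside the same block.
def write_once(path, rows)
  CSV.open(path, 'wb') do |csv|
    rows.each { |row| csv << row }
  end
end

Dir.mktmpdir do |dir|
  reopened = File.join(dir, 'reopened.csv')
  once = File.join(dir, 'once.csv')
  write_reopening(reopened, ROWS)
  write_once(once, ROWS)
  puts "reopened each time: #{CSV.read(reopened).size} row(s)"
  puts "opened once:        #{CSV.read(once).size} row(s)"
end
```

The file that is reopened for every row ends up holding only the last row, while the file opened once holds all three.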

Upvotes: 0
