Reputation: 2589
I am trying to scrape a web page that relies heavily on JavaScript. With the help of pguardiano, I have this piece of code in Ruby.
require 'rubygems'
require 'watir-webdriver'
require 'csv'
@browser = Watir::Browser.new
@browser.goto 'http://www.oddsportal.com/matches/soccer/'
CSV.open('out.csv', 'w') do |out|
  @browser.trs(:class => /deactivate/).each do |tr|
    out << tr.tds.map(&:text)
  end
end
The scraping runs repeatedly in the background, with a sleep time of approximately 1 hour between runs. I have no experience with Ruby, and in particular with web scraping, so I have a couple of questions.
How can I avoid opening a new Firefox session on every run, with all the CPU and RAM consumption that comes with it?
Is it possible to use the Firefox engine without its GUI?
Upvotes: 1
Views: 892
Reputation: 2016
You can try a headless option, for example the headless gem, which runs the browser inside an Xvfb virtual display (Linux only).
require 'watir-webdriver'
require 'headless'
headless = Headless.new
headless.start
b = Watir::Browser.start 'www.google.com'
puts b.title
b.close
headless.destroy
An alternative is to use the Selenium server. A third alternative is to use a scraper like Kapow.
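To address the first question as well, here is a minimal sketch that combines the headless display with a single long-lived browser, so that Firefox is started once and reused for every hourly pass instead of being relaunched. The method names dump_rows and scrape_forever and the interval parameter are my own, not part of Watir or the headless gem, and the sketch assumes Xvfb and Firefox are installed on the machine.

```ruby
require 'csv'

# Write one CSV row per matching table row.
# `browser` is any object that responds to `trs` like a Watir browser.
# (dump_rows is a hypothetical helper name, not a Watir method.)
def dump_rows(browser, path)
  CSV.open(path, 'w') do |out|
    browser.trs(:class => /deactivate/).each do |tr|
      out << tr.tds.map(&:text)
    end
  end
end

# Start one headless browser and reuse it for every pass.
# (scrape_forever and `interval` are hypothetical names for this sketch.)
def scrape_forever(url, path, interval = 3600)
  # Gems are loaded here so dump_rows can be used without them.
  require 'watir-webdriver'
  require 'headless'

  headless = Headless.new
  headless.start
  browser = Watir::Browser.new

  loop do
    browser.goto url          # reload the page in the same session
    dump_rows(browser, path)  # overwrite the CSV with fresh rows
    sleep interval            # wait before the next pass
  end
ensure
  # Runs on Ctrl-C or on any error, so no stray Firefox or Xvfb is left.
  browser.close if browser
  headless.destroy if headless
end

# Usage (runs until interrupted):
# scrape_forever('http://www.oddsportal.com/matches/soccer/', 'out.csv')
```

Keeping the session open trades a constant memory footprint for not paying the startup cost every hour. If idle memory matters more than startup time, closing the browser after each pass and letting cron start the script hourly is the opposite trade-off.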
Upvotes: 2