Steve

Reputation: 3080

Handling Failed Connection on Proxy for Nokogiri

I use the following line of code to scrape the HTML of a site. As you can see, I pass a proxy to it:

doc = Nokogiri::HTML(open(Scrape.scrape_url + page.to_s, :proxy => 'http://177.19.134.66:8080'))

Sometimes these proxies go down, and I then get the error:

A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond. - connect(2)

I am very new to Ruby, but what I would like to do is create a list of proxy IP addresses and attempt the scrape using the first one. If it fails, the code should try the next one, and so on until there are none left to check.

How would I go about creating a list and then handling the error?

Upvotes: 1

Views: 900

Answers (1)

pguardiario

Reputation: 54984

The simplest approach would be:

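require 'nokogiri'
require 'open-uri'
# Try each proxy in turn; a failed attempt is rescued to nil, so the loop moves on.
['http://localhost:8080','http://localhost:8888','http://localhost:8000'].each do |proxy|
  break if @doc = Nokogiri::HTML(open(Scrape.scrape_url + page.to_s, :proxy => proxy)) rescue nil
end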

Note the use of '@doc' rather than 'doc': a local variable first assigned inside the block would go out of scope when the loop ends.
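For anyone who wants the failures to be visible rather than silently swallowed, here is a slightly longer sketch of the same idea. It is not part of the original one-liner: the names fetch_via_proxies, PROXIES and DEFAULT_FETCHER are hypothetical, the proxy URLs are placeholders, and the timeout values are arbitrary assumptions. It uses URI.open, which is the open-uri entry point on Ruby 2.5 and later (plain open no longer accepts URLs on Ruby 3.0+). Because the method returns the page body, the scoping issue with 'doc' does not come up.

```ruby
require 'open-uri'

# Hypothetical proxy list; replace these placeholders with real proxy URLs.
PROXIES = [
  'http://localhost:8080',
  'http://localhost:8888',
  'http://localhost:8000'
].freeze

# Default fetcher: download the page body through the given proxy.
# The timeout values are assumptions; tune them for your proxies.
DEFAULT_FETCHER = lambda do |url, proxy|
  URI.open(url, proxy: proxy, open_timeout: 5, read_timeout: 10) { |io| io.read }
end

# Try each proxy in order and return the first successful response body,
# or nil if every proxy fails. The fetcher is injectable so the fallback
# logic can be exercised without touching the network.
def fetch_via_proxies(url, proxies = PROXIES, fetcher: DEFAULT_FETCHER)
  proxies.each do |proxy|
    begin
      return fetcher.call(url, proxy)
    rescue StandardError => e
      # Connection errors and timeouts are StandardError subclasses;
      # narrow this list if you want other errors to propagate.
      warn "Proxy #{proxy} failed (#{e.class}: #{e.message}); trying the next one"
    end
  end
  nil
end

# Usage with the code from the question (requires the nokogiri gem):
#   html = fetch_via_proxies(Scrape.scrape_url + page.to_s)
#   doc  = Nokogiri::HTML(html) if html
```

Rescuing StandardError is deliberately broad here so that one bad proxy never aborts the loop; the warning line at least records which proxy failed and why.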

Upvotes: 3
