Amadeus Pagel

Reputation: 75

ConnectionPool::Error (no connections are checked out)

I get this error when trying to scrape a website with Mechanize.

This is my code:

require 'mechanize'

agent = Mechanize.new
agent.user_agent_alias = 'Mac Safari'
agent.keep_alive = false

page = agent.get('https://web.archive.org/web/20170417084732/https://www.cs.auckland.ac.nz/~andwhay/postlist.html')

# Follow every link labelled "post" and store its title and description.
page.links_with(:text => 'post').each do |link|
  post = link.click
  Article.create(
    user_id: 1,
    # Store plain strings rather than Nokogiri nodes.
    title: post.title,
    text: post.at("//div[@itemprop = 'description']")&.text
  )
end

I also used the monkey patch from a blog post to avoid the "too many connection resets" error.

Upvotes: 3

Views: 797

Answers (1)

Jonathan Hefner

Reputation: 66

The code from the linked blog post seems to be incompatible with v3.0.0 of the net-http-persistent gem. Note that Mechanize v2.7.6 (the current version as of this writing) is compatible with net-http-persistent >= v2.5.2, which includes v3.0.0.

The short answer is to do one of the following:

  • Pin net-http-persistent to v2.9.4
  • (experimental) Remove the call to self.http.shutdown on line 44 of the linked blog post
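
For the first option, a minimal sketch of the Gemfile entry, assuming the project uses Bundler (the question does not say how its gems are managed):

```ruby
# Gemfile
# Exact pin to the last 2.x release mentioned above, which predates
# the switch to the connection_pool gem in v3.0.0.
gem 'net-http-persistent', '2.9.4'
```

If Gemfile.lock already has v3.0.0 locked, running `bundle update net-http-persistent` afterwards should let Bundler resolve down to the pinned version.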

The long answer is that the net-http-persistent gem started using the connection_pool gem in v3.0.0, which changed the behavior of Net::HTTP::Persistent#shutdown (aka self.http.shutdown in Mechanize::HTTP::Agent). The new behavior raises a ConnectionPool::Error ("no connections are checked out") if a request is made after shutdown has been invoked.

However, looking through the code of both net-http-persistent v2.9.4 and v3.0.0, it seems like self.http.shutdown may not have been necessary in the first place. The main purpose of shutdown seems to be invoking finish on each of the connections. In both v2.9.4 and v3.0.0, when Net::HTTP::Persistent#request rescues from an Errno::ECONNRESET exception (the original cause of all this), it retries only once and then calls Net::HTTP::Persistent#request_failed. request_failed in turn calls Net::HTTP::Persistent#finish with the connection. Thus, it seems the only necessary monkey patching is to retry more than once.
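
As an illustration of the "retry more than once" idea, here is a minimal sketch that retries at the call site instead of monkey patching the library. This is a swapped-in technique, not the blog post's patch: `with_retries` and its parameters are hypothetical names that do not exist in Mechanize or net-http-persistent.

```ruby
# Hypothetical helper (not part of Mechanize or net-http-persistent).
# Runs the block and retries it when one of the listed errors is raised,
# giving up after max_attempts tries in total.
def with_retries(max_attempts: 3, on: [Errno::ECONNRESET], delay: 0)
  attempts = 0
  begin
    attempts += 1
    yield attempts
  rescue *on
    # Out of attempts: re-raise the error that was just rescued.
    raise if attempts >= max_attempts
    sleep(delay) if delay > 0
    retry
  end
end
```

In the scraper this would wrap each request, for example `page = with_retries { agent.get(url) }`. Since net-http-persistent appears to surface exhausted connection resets as its own `Net::HTTP::Persistent::Error`, that class is probably what should be passed via `on:` in practice; treat that as an assumption to check against the installed version.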

Upvotes: 4
