tibbon
tibbon

Reputation: 1028

Resque worker causing ActiveRecord::StatementInvalid: PG::Error: SSL SYSCALL error: EOF detected

I've had this problem on two apps now. Heroku themselves haven't been able to be much help (yet).

I am using:

Everything works great when running with my local Postgres and Redis databases.

And here's an example of the the errors in my Heroku log:

2013-01-01T21:17:27+00:00 app[resque_worker.1]: Found job on #<Resque::Queue:0x00000006652920>
2013-01-01T21:17:27+00:00 app[resque_worker.1]: got: (Job{facebook} | FacebookRefresh | ["facebook_key"])
2013-01-01T21:17:27+00:00 app[resque_worker.1]: Running before_fork hooks with [(Job{facebook} | FacebookRefresh | ["facebook_key"])]
2013-01-01T21:17:27+00:00 app[resque_worker.1]: Running after_fork hooks with [(Job{facebook} | FacebookRefresh | ["facebook_key"])]
2013-01-01T21:17:27+00:00 app[resque_worker.1]: resque-2.0.0.pre.1: Processing facebook since 1357075047
2013-01-01T21:17:27+00:00 app[resque_worker.1]: resque-2.0.0.pre.1: Forked 503 at 1357075047
2013-01-01T21:17:27+00:00 app[resque_worker.1]: Running before_perform hooks with [(Job{facebook} | FacebookRefresh | ["facebook_key"])]
2013-01-01T21:17:27+00:00 app[resque_worker.1]: :             SELECT a.attname, format_type(a.atttypid, a.atttypmod),
2013-01-01T21:17:27+00:00 app[resque_worker.1]:               FROM pg_attribute a LEFT JOIN pg_attrdef d
2013-01-01T21:17:27+00:00 app[resque_worker.1]:                 ON a.attrelid = d.adrelid AND a.attnum = d.adnum
2013-01-01T21:17:27+00:00 app[resque_worker.1]:                      pg_get_expr(d.adbin, d.adrelid), a.attnotnull, a.atttypid, a.atttypmod
2013-01-01T21:17:27+00:00 app[resque_worker.1]: ). Retrying...
2013-01-01T21:17:27+00:00 app[resque_worker.1]: Performing FacebookRefresh caused an exception (PG::Error: SSL SYSCALL error: EOF detected
2013-01-01T21:17:27+00:00 app[resque_worker.1]:              WHERE a.attrelid = '"facebook_accounts"'::regclass
2013-01-01T21:17:27+00:00 app[resque_worker.1]:              ORDER BY a.attnum
2013-01-01T21:17:27
+00:00 app[resque_worker.1]:                AND a.attnum > 0 AND NOT a.attisdropped
2013-01-01T21:17:27+00:00 app[resque_worker.1]: :             SELECT a.attname, format_type(a.atttypid, a.atttypmod),
2013-01-01T21:17:27+00:00 app[resque_worker.1]:               FROM pg_attribute a LEFT JOIN pg_attrdef d
2013-01-01T21:17:27+00:00 app[resque_worker.1]:                 ON a.attrelid = d.adrelid AND a.attnum = d.adnum
2013-01-01T21:17:27+00:00 app[resque_worker.1]: (Job{facebook} | FacebookRefresh | ["facebook_key"]) failed: #<ActiveRecord::StatementInvalid: PG::Error: SSL SYSCALL error: EOF detected
2013-01-01T21:17:27+00:00 app[resque_worker.1]: >
2013-01-01T21:17:27+00:00 app[resque_worker.1]:                      pg_get_expr(d.adbin, d.adrelid), a.attnotnull, a.atttypid, a.atttypmod
2013-01-01T21:17:27+00:00 app[resque_worker.1]:                AND a.attnum > 0 AND NOT a.attisdropped
2013-01-01T21:17:27+00:00 app[resque_worker.1]:              WHERE a.attrelid = '"facebook_accounts"'::regclass
2013-01-01T21:17:27+00:00 app[resque_worker.1]:              ORDER BY a.attnum
2013-01-01T21:17:27+00:00 app[resque_worker.1]: Running before_fork hooks with [(Job{facebook} | FacebookRefresh | ["facebook_key"])]

I've tried so many before_hook and after_hook things in my unicorn's config file, but none of them seem to help.

# What the timeout for killing busy workers is, in seconds
timeout 60

# Whether the app should be pre-loaded
preload_app true

# How many worker processes
worker_processes 3

before_fork do |server, worker|
  # Replace with MongoDB or whatever
  if defined?(ActiveRecord::Base)
    ActiveRecord::Base.connection.disconnect!
    Rails.logger.info('Disconnected from ActiveRecord')
  end

  # If you are using Redis but not Resque, change this
  if defined?(Resque)
    Resque.redis.quit
    Rails.logger.info('Disconnected from Redis')
  end

  sleep 1
end

after_fork do |server, worker|
  if defined?(ActiveRecord::Base)
    ActiveRecord::Base.establish_connection
    Rails.logger.info('Connected to ActiveRecord')
  end

  if defined?(Resque)
    Resque.redis = ENV['OPENREDIS_URL'] || 'redis://localhost:6379'
    Rails.logger.info('Connected to Redis')
  end
end

And my Procfile

web: bundle exec unicorn -c lib/unicorn/config.rb -p $PORT
resque_scheduler: env bundle exec rake resque:scheduler
resque_worker: env QUEUE=* bundle exec rake environment resque:work

So one thing I'm wondering is that my resque_worker doesn't use the Unicorn config at all, and since its running on a completely separate Heroku worker, I'm unsure if there's any way it would know about that stuff. The web instnace is just fine, as is the scheduler. Its just the resque_worker that's blowing up on every postgres call.

I'm not making any particularly crazy database calls from the worker. An example on might be:

def queue_users_for_refresh
  FacebookAccount.all.each do |x|
    Resque.enqueue(FacebookAccountRefresh, x.username)
  end
end

And another one later (in the FacebookAccountRefresh) that is:

FacebookAccount.where(:username => user).first

Upvotes: 1

Views: 3719

Answers (1)

Winfield
Winfield

Reputation: 19145

This looks like an error caused by sharing database connections across processes incorrectly. This happens when the Resque worker isn't re-initializing database connections after forking.

Do you have a Resque initializer? It looks like you're missing after_fork directives for Resque workers to match the ones in your Unicorn app server workers.

Add/edit your Resque initializer file (ie: config/initializers/resque.rb):

Resque.after_fork = Proc.new { ActiveRecord::Base.establish_connection }

Upvotes: 5

Related Questions