Reputation: 5872
For quite some time, I have been receiving connection pool timeout errors and
2020-06-17T16:15:33.701Z pid=1132347 tid=goeqpzc8b WARN: PG::ConnectionBad: FATAL: sorry, too many clients already
errors while using Sidekiq workers and Postgres. I have tried to understand how this all works, but I cannot seem to get away from these errors no matter what I adjust my settings to.
Right now, my config/database.yml file uses these settings:
default: &default
  adapter: postgresql
  encoding: unicode
  pool: 20
  port: 25060
In my config/sidekiq.yml file, I also have the following options set:
development:
  :concurrency: 20
production:
  :concurrency: 20
  :queues:
    - default
According to my database settings from Digital Ocean, I have a backend server connection limit of 22.
With that being said, why do the Sidekiq jobs fail instead of just queueing up and waiting to run? Each of my Sidekiq workers has code in it to CRUD entries in ActiveRecord, but I'm not sure how much of this matters in the actual Sidekiq workers.
To my understanding, my limit with the managed database is 22 connections tops, but my settings in the database.yml and sidekiq.yml files are lower than that limit, so I'm not sure how or why it still fails rather than just remaining in the queue until there's room.
Given this scenario, are there suggested pool and concurrency settings that I should be using? Some of the commands in the Sidekiq workers monitor changes in a particular record to see when it's updated, so they continuously check the record and then proceed once the expected information is in there.
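Roughly, one of those polling workers looks something like this (a simplified sketch, not my real code; the Order model and processed_at column are just placeholders):

class WaitForRecordWorker
  include Sidekiq::Worker

  def perform(record_id)
    # poll until the expected data shows up in the record
    # (Order and processed_at are placeholder names)
    until Order.where(id: record_id).where.not(processed_at: nil).exists?
      sleep 5
    end
    # ... proceed with the CRUD work once the record is ready ...
  end
end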
I am looking for any suggestions or clarification on this.
Upvotes: 0
Views: 638
Reputation: 44137
PG::ConnectionBad: FATAL: sorry, too many clients already
The gap between 20 and 22 doesn't give you much margin of safety. Having pgAdmin4 open and then having some monitoring connection start up, for example (both bypassing the connection pool), would put you over the limit if the pool is trying to be fully used. If you can't increase max_connections, try reducing the pool size by a couple to give yourself more safety. It could make the other problem worse, but going from two problems to one problem is a type of progress.
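If you go that route, a minimal database.yml sketch along these lines would leave some headroom (the DB_POOL variable name is just an illustration, not something your app necessarily uses):

default: &default
  adapter: postgresql
  encoding: unicode
  port: 25060
  # stay a couple of connections under the 22-connection server cap so stray
  # clients (pgAdmin, monitoring, a console) don't push you over it
  pool: <%= ENV.fetch("DB_POOL", 18) %>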
With that being said, why do the sidekiq jobs fail instead of just queueing up and waiting to run?
I'm not a Sidekiq user, but apparently they are only willing to wait for so long, and that period is being exceeded. Or at least, "connection pool timeout" is not a message generated by PostgreSQL, so it is coming from somewhere else; and based on the wording, the pool manager is the obvious candidate.
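If the timeout really is ActiveRecord's connection pool giving up, its wait time is configurable; a sketch, assuming the message comes from ActiveRecord's checkout_timeout (which defaults to 5 seconds):

default: &default
  adapter: postgresql
  pool: 18
  # how long a thread waits for a free connection before the pool raises
  # ActiveRecord::ConnectionTimeoutError (the default is 5 seconds)
  checkout_timeout: 10

Raising it only buys more waiting time, though; the total number of connections across the web process, Sidekiq, and anything else still has to stay under the server limit.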
Upvotes: 2