SQLAlchemy connection pooling, what are checked out connections?

I have a single threaded process that is running a bunch of queries using SQLAlchemy. After a while, it starts either progressing very slowly or I start getting the following error:

(<class 'sqlalchemy.exc.TimeoutError'>): QueuePool limit of size 10 overflow 0 reached, connection timed out, timeout 30 (Background on this error at: http://sqlalche.me/e/3o7r)

I've read those docs. Since I don't have concurrent connections it seems that this should only happen if I'm not calling session.close(), which I verified that I am.

I added some logging to try to understand what's going on. I'm regularly logging session.get_bind().pool.status(), this is a sample of what I see:

size: 10  Connections in pool: 0 Current Overflow: -9 Current Checked out connections: 1
size: 10  Connections in pool: 4 Current Overflow: -2 Current Checked out connections: 4
size: 10  Connections in pool: 4 Current Overflow: 0 Current Checked out connections: 6
size: 10  Connections in pool: 3 Current Overflow: 0 Current Checked out connections: 7
size: 10  Connections in pool: 4 Current Overflow: 0 Current Checked out connections: 6
size: 10  Connections in pool: 3 Current Overflow: 0 Current Checked out connections: 7
size: 10  Connections in pool: 2 Current Overflow: 0 Current Checked out connections: 8
size: 10  Connections in pool: 1 Current Overflow: 0 Current Checked out connections: 9
size: 10  Connections in pool: 0 Current Overflow: 0 Current Checked out connections: 10
QueuePool limit of size 10 overflow 0 reached, connection timed out, timeout 30 (Background on this error at: http://sqlalche.me/e/3o7r)

This suggests that Checked out connection are not the same as closed connections, as they are clearly holding onto something.

I'd like to understand what a Checked out connection is and what do I need to do to free the spot from the pool.

I should say that I'm working on a codebase that has all sort of weird instrumentation around creating sessions so probably the error is in there, but in order to fix that I want to understand what is happening.

Upvotes: 1

Views: 9095

Answers (1)

Paddy Alton
Paddy Alton

Reputation: 2348

Because databases cannot support an unlimited number of simultaneous connections, there has to be a limit on how many connections are active at any one time. If one keeps creating new connections and never closing them the database will eventually reject them - applications that communicate with the database should prevent this from happening.

There is some overhead in creating a new database connection, which can be significant. So for efficiency it is standard practice to use a connection pool. Connections, once created, are not destroyed, but rather 'checked out' to do some work and then released back to the pool when the work is done until they are needed again.

(Database connections can expire, so they will sometimes be destroyed and replaced)

The pool size is configurable but the default is ten connections for SQLAlchemy. If all the connections are simultaneously checked out then you can expect an error (there will be a timeout period during which SQLAlchemy waits to see if a connection gets freed up; this is also configurable).

In this case the logs show that you are checking out connections and not releasing them back to the pool when you're finished with them. Without seeing the code it's hard to know why. One possibility is this: you are right that calling session.close should release the connection back to the pool. But you shouldn't rely on this. Is there a chance the close call is never made, e.g. due to some exception occurring before it happens?

The standard way to proof your application against this is to use a context manager (a with Session.begin() as session: block is an example).

Context managers ('with' statements) can ensure any necessary cleanup code is called when the block exits, whether normally or via an exception (alternatively one could have a try/except/finally block with session.close in the finally section, but the context manager is cleaner syntax). In this case SQLAlchemy should make sure the connection is checked back into the pool.

Upvotes: 7

Related Questions