Viacheslav Zhukov

Reputation: 1360

How to properly set pool_size (and max_overflow) in SQLAlchemy for ASGI app

We're building an ASGI app using fastapi, uvicorn, sqlalchemy and PostgreSQL. The question is: how should we set pool_size in create_async_engine so that it doesn't become a bottleneck compared to a WSGI app with multiple workers?

As far as I understand, in a WSGI app, if we run N processes with M threads each (and pool_size=M), we'll get at most N * M connections. But what about ASGI with a single process - how many connections can be opened? Just pool_size (since we create only one AsyncEngine per process)? Should we then set pool_size=N * M?

And, if we simply increase this number, then we'll be able to make more concurrent awaited requests to the database, right?

What is the intuition behind it?

Thanks in advance!

Upvotes: 22

Views: 9435

Answers (2)

jsbueno

Reputation: 110506

Async work will, by default, just run "as fast as possible" on a single thread. That means that running N = (2 x CPU count) processes will, in the best case, merely saturate your CPU cores; the more likely bottleneck is on the database I/O side. Two years have passed since the question was asked, and in the meantime the psycopg 3 connector (the default when one installs the "psycopg" Python project as a dependency), with native asyncio support, has been released. It is certainly a better choice than psycopg2 or the old "aiopg", a smaller project that uses the synchronous psycopg2 (probably with a thread pool) under the hood.
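For reference, a hedged sketch of pointing an async engine at the psycopg (v3) dialect, assuming SQLAlchemy 2.x; the URL and pool numbers are placeholders:

```python
from sqlalchemy.ext.asyncio import create_async_engine

# "postgresql+psycopg" selects the psycopg (v3) dialect, which ships
# with native asyncio support. Credentials and sizes are placeholders.
engine = create_async_engine(
    "postgresql+psycopg://user:password@localhost/dbname",
    pool_size=20,
    max_overflow=10,
)
```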

That said, incoming HTTP requests will spawn new tasks regardless of whether the worker is saturated with unfinished database-related async tasks. They will just "pile up", and it is natural for the DB requests to resolve first and free up resources: async tasks are really lightweight, and tens of thousands of them take only the memory needed to represent their state while they wait for resolution. But the "everything should go well" part might not hold.

On the other hand, it is simply not possible to derive a "max requests served" figure the way you can with the N processes x M threads approach. New connections are simply accepted; they run and create whatever I/O tasks they need to resolve. So any manual tuning of the thread count M is gone: the system will simply saturate naturally.

So, to be on the safe side, it may be wise to have a mechanism that avoids accepting a new HTTP request when there are already a lot of ongoing tasks. The problem is that this can't be done inside the view code, because a task has already been accepted at that point. Instead, an ASGI middleware could use a semaphore to count the number of active views (or the total number of tasks), and signal to an upstream layer (for example, by returning an HTTP 503 error) that the resources are saturated. (In a container-orchestration scenario such as Kubernetes, that could trigger the provisioning of extra pods.) Otherwise, just "letting it go" should be fine: let other layers decide when to do horizontal scaling. Your other bottleneck, of course, will be the database at some point.
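A minimal sketch of such a middleware, written against the raw ASGI protocol so it works under any ASGI framework; the class name, the limit, and the 503 response shape are my assumptions, not part of any library:

```python
import asyncio

class ConcurrencyLimitMiddleware:
    """Reject new HTTP requests once too many are already in flight.

    Pure-ASGI sketch: wraps any ASGI app and returns 503 when the
    semaphore is exhausted, instead of piling up more tasks.
    """

    def __init__(self, app, max_concurrent=100):
        self.app = app
        self.semaphore = asyncio.Semaphore(max_concurrent)

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            await self.app(scope, receive, send)
            return
        if self.semaphore.locked():
            # Saturated: signal the upstream layer (load balancer,
            # autoscaler) to back off rather than queueing more work.
            await send({"type": "http.response.start", "status": 503,
                        "headers": [(b"retry-after", b"1")]})
            await send({"type": "http.response.body", "body": b"saturated"})
            return
        async with self.semaphore:
            await self.app(scope, receive, send)
```

In FastAPI this could be attached with app.add_middleware(ConcurrencyLimitMiddleware, max_concurrent=...); the same counter could just as well track total tasks instead of active views.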

Upvotes: 1

Selimzhan Ozer

Reputation: 34

In SQLAlchemy, to set the pool size and max overflow you can use:

engine = create_async_engine(
    settings.ASYNC_SQLALCHEMY_URL,
    echo=settings.SQLALCHEMY_ECHO,
    pool_size=20,
    max_overflow=10,
)

SQLAlchemy has no separate "max pool size" setting: the effective cap on connections is pool_size + max_overflow (30 in the example above). A pool_size of 20 is a commonly recommended starting point.

Upvotes: -2
