How does gunicorn distribute requests across sync workers?

Question

I am using gunicorn to run a simple HTTP server¹ using e.g. 8 sync workers (processes). For practical reasons I am interested in knowing how gunicorn distributes incoming requests between these workers.

Assume that all requests take the same time to complete.

Is the assignment random? Round-robin? Resource-based?

The command I use to run the server:

gunicorn --workers 8 --bind 0.0.0.0:8000 main:app

¹ I'm using FastAPI but I believe this is not relevant for this question.

aaron · Accepted Answer

Gunicorn does not distribute requests.

Each worker is spawned with the same LISTENERS (e.g. gunicorn.sock.TCPSocket) in Arbiter.spawn_worker(), and calls listener.accept() on its own.

Which worker receives a given connection is decided inside the blocking OS call behind the socket's accept() method: whichever worker the OS kernel wakes up is handed the client connection. This is an OS implementation detail that, empirically, is neither round-robin nor resource-based.
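To see this behaviour without gunicorn, here is a minimal sketch of the pre-fork pattern described above. It is not gunicorn's code. The function name run_prefork_demo and its parameters are made up for illustration, and it assumes a Unix-like OS where os.fork() is available. The parent opens one listening socket, forks the workers, and never calls accept() itself; each worker blocks in accept() on the inherited socket and replies with its own PID.

```python
import os
import signal
import socket
from collections import Counter


def run_prefork_demo(num_workers=4, num_requests=20):
    """Fork workers that share one listening socket and count who serves each request.

    Hypothetical helper for illustration only; not part of gunicorn.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(64)
    address = listener.getsockname()

    worker_pids = []
    for _ in range(num_workers):
        pid = os.fork()
        if pid == 0:
            # Child: inherits the listening socket and blocks in accept().
            # The kernel, not the parent, decides which child gets a connection.
            try:
                while True:
                    conn, _ = listener.accept()
                    with conn:
                        conn.sendall(str(os.getpid()).encode())
            finally:
                os._exit(0)
        worker_pids.append(pid)

    # Parent: acts as the client here and never accepts a connection itself.
    served_by = Counter()
    try:
        for _ in range(num_requests):
            with socket.create_connection(address) as client:
                reply = client.makefile("rb").read()  # read until the worker closes
                served_by[int(reply)] += 1
    finally:
        for pid in worker_pids:
            os.kill(pid, signal.SIGTERM)
            os.waitpid(pid, 0)
        listener.close()
    return worker_pids, served_by


if __name__ == "__main__":
    pids, counts = run_prefork_demo()
    for pid in pids:
        print(pid, counts.get(pid, 0))
```

Running it prints how many requests each worker PID served. The split depends on the kernel and on timing, so it can differ between runs and between operating systems, and some workers may serve far more requests than others.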

Reference from the docs

From https://docs.gunicorn.org/en/stable/design.html:

Gunicorn is based on the pre-fork worker model. ... The master never knows anything about individual clients. All requests and responses are handled completely by worker processes.

Gunicorn relies on the operating system to provide all of the load balancing when handling requests.
