Reputation: 896
I am using gunicorn to run a simple HTTP server [1] with, e.g., 8 sync workers (processes). For practical reasons I am interested in knowing how gunicorn distributes incoming requests between these workers.
Assume that all requests take the same time to complete.
Is the assignment random? Round-robin? Resource-based?
The command I use to run the server:
gunicorn --workers 8 --bind 0.0.0.0:8000 main:app
[1] I'm using FastAPI but I believe this is not relevant for this question.
Upvotes: 13
Views: 3252
Reputation: 43073
Gunicorn does not distribute requests.
Each worker is spawned with the same LISTENERS (e.g. gunicorn.sock.TCPSocket) in Arbiter.spawn_worker(), and calls listener.accept() on its own.
The assignment happens inside the blocking OS call behind the socket's accept() method, i.e. it is whichever worker the OS kernel later wakes up and hands the client connection to. That is an OS implementation detail that, empirically, is neither round-robin nor resource-based.
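To make the shared-listener idea concrete, here is a minimal sketch of the pre-fork pattern using only the Python standard library. It is not gunicorn's code: the function name run_prefork_demo and its parameters are made up for illustration, and it assumes a Unix-like OS where os.fork() is available. Each forked worker inherits the same listening socket and blocks in accept(); the parent never picks a worker, it only counts which PID answered.

```python
import os
import signal
import socket
from collections import Counter


def run_prefork_demo(num_workers=4, num_requests=20):
    """Fork workers that share one listening socket; return (served, pids)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(64)
    port = listener.getsockname()[1]

    pids = []
    for _ in range(num_workers):
        pid = os.fork()
        if pid == 0:
            # Child process: it inherited `listener` and accepts on its own.
            try:
                while True:
                    conn, _addr = listener.accept()  # kernel decides who wakes up
                    with conn:
                        conn.recv(1024)
                        conn.sendall(str(os.getpid()).encode())
            finally:
                os._exit(0)  # never fall back into the parent's code path
        pids.append(pid)

    # Parent process: act as the client and record which worker replied.
    served = Counter()
    for _ in range(num_requests):
        with socket.create_connection(("127.0.0.1", port)) as client:
            client.sendall(b"ping")
            served[int(client.recv(64))] += 1

    # Stop and reap the workers.
    for pid in pids:
        os.kill(pid, signal.SIGKILL)
        os.waitpid(pid, 0)
    listener.close()
    return served, pids


if __name__ == "__main__":
    counts, _ = run_prefork_demo()
    for worker_pid, n in sorted(counts.items()):
        print(f"worker {worker_pid} served {n} request(s)")
```

The per-worker counts this prints depend on the kernel and on timing, so they may differ between runs and between operating systems.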
From https://docs.gunicorn.org/en/stable/design.html:
Gunicorn is based on the pre-fork worker model. ... The master never knows anything about individual clients. All requests and responses are handled completely by worker processes.
Gunicorn relies on the operating system to provide all of the load balancing when handling requests.
Upvotes: 9
Reputation: 8859
In my case (also with FastAPI), I found that it starts out looking round-robin, and then becomes inefficient once all workers are busy.
Example:
I am trying to fix that inefficient behavior for the 92 requests mentioned above. No success thus far.
Hopefully, someone else can add their insights?
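For anyone who wants to check this kind of observation on their own machine, one option is to have every response report the PID of the worker that handled it. The sketch below is a plain WSGI app (not the FastAPI app from the posts above); the module name pid_app.py is only a placeholder.

```python
import os


def app(environ, start_response):
    """Minimal WSGI app: the response body names the worker process."""
    body = f"handled by pid {os.getpid()}\n".encode()
    start_response(
        "200 OK",
        [
            ("Content-Type", "text/plain"),
            ("Content-Length", str(len(body))),
        ],
    )
    return [body]
```

Saved as pid_app.py, it could be started with gunicorn --workers 8 --bind 0.0.0.0:8000 pid_app:app and queried repeatedly (sequentially and concurrently) to see how the PIDs are spread across requests.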
Upvotes: 2