Nandha

Reputation: 31

Issues with FastAPI Uvicorn workers | Only one worker is handling the request

I have a sample FastAPI application with a single endpoint that returns a result after a delay of 5 seconds. Please find the code below.

from fastapi import FastAPI
import uvicorn, os, time
app = FastAPI()

@app.get("/delayed-response")
async def read_root():
    time.sleep(5)
    return {"message": f"This is a delayed response!- {os.getpid()}"}

if __name__ == "__main__":
    uvicorn.run(
        "main:app",
        host='127.0.0.1',
        port=9090,
        reload=False,
        workers=10,  # just 10 workers, to understand the concept
    )

Now I also have a script that will be sending parallel requests to the endpoint http://localhost:9090/delayed-response
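A minimal sketch of such a load script (the request count and use of a thread pool are assumptions; the question's script is not shown). To keep it self-contained, fetch() here simulates the server's delay with a short sleep; in the real script its body would be an HTTP call such as urllib.request.urlopen("http://localhost:9090/delayed-response"):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fetch(i: int) -> str:
    # Stand-in for one HTTP request to the delayed endpoint.
    time.sleep(0.1)  # simulates the server's delayed response
    return f"response {i}"

def fire_parallel(n: int) -> list[str]:
    # Submit all n requests at once; collect results in submission order.
    with ThreadPoolExecutor(max_workers=n) as pool:
        futures = [pool.submit(fetch, i) for i in range(n)]
        return [f.result() for f in futures]

start = time.monotonic()
results = fire_parallel(20)
elapsed = time.monotonic() - start
print(f"{len(results)} responses in {elapsed:.2f}s")
```

With 20 threads, all 20 simulated requests overlap, so the whole batch finishes in roughly one delay rather than twenty.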

Upon starting the application I can see that all 10 workers, with process IDs from 9 to 18, are started successfully.

When I send 20 parallel requests to this endpoint, I observe that only the first few requests are handled in parallel by the workers; after a certain point, only one worker handles all the remaining requests. Attaching a few screenshots of the responses.


Can anyone explain this behavior?

  1. When a burst of requests is sent to an endpoint, why is the first set of requests (equal to the number of workers) handled in parallel, while afterwards only one worker handles the rest?
  2. Is this a behaviour of FastAPI or of Uvicorn?

Note: Let us consider this application to be a synchronous application. I understand that there are ways to make it work concurrently, but at this point I wish to understand this behaviour.

My expectation: when 10 workers are configured and 30 requests are sent in parallel, the 30th request to an endpoint with a 5 s delay should get its response at around the 15th to 17th second.

Upvotes: 1

Views: 4095

Answers (2)

booooh

Reputation: 41

Note that you're mixing an async handler (async def read_root) with a blocking (synchronous) call to time.sleep.

FastAPI can handle blocking requests via its internal threadpool, but to do so you need to define a regular handler (def read_root).

If you have an async handler, you should use: await asyncio.sleep(5) instead

See here: https://docs.python.org/3/library/asyncio-task.html#task-groups And here: https://fastapi.tiangolo.com/async/#in-a-hurry
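The difference can be demonstrated with the event loop alone, no FastAPI needed. In this sketch (delays shortened to 0.2 s for brevity), three coroutines that call time.sleep run one after another because each blocks the whole loop, while three that await asyncio.sleep overlap:

```python
import asyncio
import time

DELAY = 0.2  # stand-in for the 5 s delay in the question

async def blocking_task():
    time.sleep(DELAY)  # blocks the entire event loop while it sleeps

async def non_blocking_task():
    await asyncio.sleep(DELAY)  # yields to the loop, so other tasks can run

async def run_three(task):
    start = time.monotonic()
    await asyncio.gather(task(), task(), task())
    return time.monotonic() - start

blocking_time = asyncio.run(run_three(blocking_task))        # ~3 * DELAY: serial
concurrent_time = asyncio.run(run_three(non_blocking_task))  # ~1 * DELAY: concurrent
print(f"blocking: {blocking_time:.2f}s, non-blocking: {concurrent_time:.2f}s")
```

The same thing happens inside each Uvicorn worker: a time.sleep in an async handler stalls that worker's event loop for the full delay.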

Upvotes: 1

protoak

Reputation: 11

When requests come in faster than the workers can handle them, some requests have to wait in a queue. This waiting can happen because of things like insufficient resources or too many requests contending for the same resources at once. In your situation, since your app processes requests one after the other, each request must finish before the next one can start. So, if too many requests come in at once, they start stacking up, and eventually only one worker is left handling them all.

Upvotes: 1

Related Questions