I have the following code in a FastAPI route handler; client is an aiohttp.ClientSession(). The service is a singleton, so every request goes through the same class instance that holds this client.
async def handler():
    log...  # "start" log
    async with client.post(
        f"{config.TTS_SERVER_ENDPOINT}/v2/models/{self.MODEL_NAME}/infer",
        json=request_payload,
    ) as response:
        response_data = await response.json()  # ✅ Get JSON response
    log...  # "finish" log
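Since all requests share one session, one thing I wondered about is whether a shared connection pool could cap throughput. Here is a stdlib-only sketch (not my service code; the pool size and latency numbers are made up) of how a small pool serializes otherwise-concurrent requests:

```python
import asyncio
import time

async def call_backend(pool: asyncio.Semaphore):
    # Each "request" must acquire a pooled connection before it can start.
    async with pool:
        await asyncio.sleep(0.05)  # simulated backend latency

async def run(pool_size: int, n_requests: int) -> float:
    pool = asyncio.Semaphore(pool_size)
    start = time.perf_counter()
    await asyncio.gather(*(call_backend(pool) for _ in range(n_requests)))
    return time.perf_counter() - start

# 20 concurrent requests through a pool of 2 run in ~10 batches of 0.05 s;
# through a pool of 20 they all overlap and finish together.
small = asyncio.run(run(2, 20))
large = asyncio.run(run(20, 20))
print(f"pool=2: {small:.2f}s, pool=20: {large:.2f}s")
```

I don't know whether my session's pool is actually the limit here; this just shows the shape of the effect.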
I am load testing the system, and both the logs and the JMeter results show that I am only handling 2-3 requests per second. Is that reasonable?
I would expect to see many "start" messages followed by many "finish" messages, but this is not the case.
Instead, the interval between a request's start and finish logs keeps growing under load, from about 0.5 seconds up to 5-6 seconds. What could be the bottleneck here?
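To make the symptom concrete, here is a stdlib-only sketch (not my service code; the timings are made up) of how the start-to-finish interval grows when something blocks the event loop, versus staying flat when the work is truly awaited:

```python
import asyncio
import time

async def handler(results, blocking):
    start = time.perf_counter()    # the "start" log
    await asyncio.sleep(0)         # let every request begin, like a real server
    if blocking:
        time.sleep(0.05)           # blocking call: stalls the whole event loop
    else:
        await asyncio.sleep(0.05)  # true await: other handlers keep running
    results.append(time.perf_counter() - start)  # the "finish" log

async def run(blocking):
    results = []
    await asyncio.gather(*(handler(results, blocking) for _ in range(10)))
    return results

blocked = asyncio.run(run(True))
awaited = asyncio.run(run(False))
# When the loop is blocked, intervals grow with load (the last request waits
# for all the earlier ones); when the work is awaited, they stay ~0.05 s.
print(f"blocked max: {max(blocked):.2f}s, awaited max: {max(awaited):.2f}s")
```

My handler looks fully async to me, which is why the growing intervals confuse me.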
I am running FastAPI in Docker with one CPU and 2 GB of memory, started with this command:
CMD ["uv", "run", "gunicorn", "-k", "uvicorn.workers.UvicornWorker", "-w", "4", "--threads", "16","--worker-connections", "2000", "-b", "0.0.0.0:8000","--preload", "src.main:app"]
where uv is the package manager I am using.
What is going on here? Handling so few requests per second does not seem reasonable to me.