rivamarco

Reputation: 767

get completed aiohttp parallel requests after timeout

I'm using aiohttp to perform some parallel HTTP POST requests.

I have to set a global timeout (on the ClientSession) so that the whole batch does not exceed a time threshold.

The problem is that I would like to keep the responses that completed before the threshold. For example, if the session issues 10 requests and 5 of them have completed when the timeout fires, I want the results of those 5. I have not figured out how to do this.

The code I'm using is something like this:

import aiohttp
import asyncio

async def fetch(session):
    async with session.get("https://amazon.com") as response:
        return response.status

async def main(n, timeout):
    async with aiohttp.ClientSession(timeout=timeout) as session:
        return await asyncio.gather(*(fetch(session) for _ in range(n)))

timeout = aiohttp.ClientTimeout(total=0.4)
res = asyncio.run(main(10, timeout))
print(res)

With a total timeout of 0.4 seconds it raises asyncio.TimeoutError, and I don't know how to get the responses that had already completed.

For example, if I set the timeout to 5 seconds, all requests complete and I get a list of ten 200s.

Thank you

Upvotes: 2

Views: 1454

Answers (1)

Pynchia

Reputation: 11590

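Use asyncio.wait instead of asyncio.gather: it returns the set of tasks that finished and the set still pending when the timeout expires, instead of raising.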

Also see this QA for further info on the differences.

Note: wait's timeout argument is expressed in seconds.

More importantly, you might not need to set a timeout on the ClientSession at all.
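To see how asyncio.wait splits the work into finished and unfinished tasks without touching the network, here is a minimal sketch. It assumes simulated requests: fake_request and collect_finished are hypothetical names used only for illustration, and asyncio.sleep stands in for the HTTP call.

```python
import asyncio


async def fake_request(delay):
    # Hypothetical stand-in for an HTTP call: sleep, then return the delay.
    await asyncio.sleep(delay)
    return delay


async def collect_finished(delays, timeout):
    # Wrap each coroutine in a Task before handing it to asyncio.wait.
    tasks = [asyncio.create_task(fake_request(d)) for d in delays]
    done, pending = await asyncio.wait(tasks, timeout=timeout)
    for task in pending:
        task.cancel()
    # Give the cancelled tasks a chance to finish unwinding.
    await asyncio.gather(*pending, return_exceptions=True)
    # done is a set, so sort the results to get a stable order.
    return sorted(task.result() for task in done), len(pending)


finished, unfinished = asyncio.run(
    collect_finished([0.01, 0.02, 1.0, 2.0], timeout=0.3)
)
print(finished, unfinished)
```

With these delays, the two short requests should land in done and the two long ones in pending. No exception is raised when the timeout expires, which is the key difference from gather under a session-wide timeout.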

The reworked code follows. To get more variance in the response times I added a few different sources, and 20 requests are performed:

import asyncio
import random
import aiohttp

sources = ["amazon.com", "hotmail.com", "stackoverflow.com"]

async def fetch(session):
    rnd = random.choice(sources)
    async with session.get(f"https://{rnd}") as response:
        return response.status

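async def main(n, timeout):
    async with aiohttp.ClientSession() as session:
        # asyncio.wait needs Task objects; passing bare coroutines
        # is an error on Python 3.11+
        tasks = [asyncio.create_task(fetch(session)) for _ in range(n)]
        completed, pending = await asyncio.wait(tasks, timeout=timeout)
        for t in pending:  # cancel the tasks that did not finish in time
            t.cancel()
    # note: t.result() re-raises if a completed request failed
    return [t.result() for t in completed]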

timeout = 0.5
res = asyncio.run(main(20, timeout))
print(res)

Running it with increasing timeout values of 0.3, 0.5 and 0.8 produces:

(.venv) async_req_timeout $ python async_req_timeout.py 
[200, 200]

(.venv) async_req_timeout $ python async_req_timeout.py 
[200, 200, 200, 200, 200, 200, 200, 200, 200, 200]

(.venv) (base) async_req_timeout $ python async_req_timeout.py 
[200, 200, 200, 200, 200, 200, 200, 200, 200, 200, 200, 200, 200]
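One caveat: asyncio.wait returns sets, so the results above are in no particular order, and calling result() on a task that failed re-raises its exception. If you need to know which request produced which result, a possible sketch is to iterate over the original task list instead. This again assumes simulated requests; fake_request, ordered_results and the None placeholder for timed-out requests are illustrative choices, not part of aiohttp.

```python
import asyncio


async def fake_request(delay, fail=False):
    # Hypothetical stand-in for fetch(): sleep, then return 200 or raise.
    await asyncio.sleep(delay)
    if fail:
        raise RuntimeError("simulated request failure")
    return 200


async def ordered_results(specs, timeout):
    tasks = [asyncio.create_task(fake_request(d, f)) for d, f in specs]
    _, pending = await asyncio.wait(tasks, timeout=timeout)
    for task in pending:
        task.cancel()
    # Wait for the cancellations to complete so every task is done.
    await asyncio.gather(*pending, return_exceptions=True)

    results = []
    for task in tasks:  # iterate the list, not the sets, to keep input order
        if task.cancelled():
            results.append(None)  # did not finish before the timeout
        elif task.exception() is not None:
            results.append(task.exception())  # finished with an error
        else:
            results.append(task.result())
    return results


specs = [(0.01, False), (1.0, False), (0.02, True), (0.03, False)]
results = asyncio.run(ordered_results(specs, timeout=0.3))
print(results)
```

Each position in the returned list lines up with the request at the same position in specs: a status for completed requests, the exception object for failed ones, and None for those cancelled at the timeout.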

Upvotes: 1
