Reputation: 33
This is my first question here on Stack Overflow, so I apologize if I did something stupid or missed something.
I am trying to make asynchronous aiohttp GET requests to many API endpoints at once to check the status of these pages. The result should be a triple of the form (url, True, "200") for a working link and (url, False, response_status) for a "problematic" link. This is the atomic function for each call:
async def ping_url(url, session, headers, endpoint):
    try:
        async with session.get((url + endpoint), timeout=5, headers=headers) as response:
            return url, (response.status == 200), str(response.status)
    except Exception as e:
        test_logger.info(url + ": " + e.__class__.__name__)
        return url, False, repr(e)
These are wrapped into a function using asyncio.gather() which also creates the aiohttp Session:
async def ping_urls(urllist, endpoint):
    headers = ... # not relevant
    async with ClientSession() as session:
        try:
            results = await asyncio.gather(*[ping_url(url, session, headers, endpoint) \
                for url in urllist],return_exceptions=True)
        except Exception as e:
            print(repr(e))
        return results
The whole thing is called from a main that looks like this:
urls = ... # not relevant
loop = asyncio.get_event_loop()
try:
    loop.run_until_complete(ping_urls(urls, endpoint))
except Exception as e:
    pass
finally:
    loop.close()
This works most of the time, but if the list is pretty long, I noticed that as soon as I get one TimeoutError the execution loop stops and I get a TimeoutError for all other urls after the first one that timed out. If I omit the timeout in the innermost function I get somewhat better results, but then it is not that fast anymore. Is there a way to control the timeouts for the single API calls instead of one big general timeout for the whole list of urls?
Any kind of help would be extremely appreciated, I got stuck with my bachelor thesis because of this issue.
Upvotes: 3
Views: 5198
Reputation: 63
I struggled with the exceptions as well. I then found a hint that I can also print the type of the exception and, with that, write appropriate exception handling.
try: ...
except Exception as e:
    print(f'Error: {e} of Type: {type(e)}')
With this you can find out what kinds of errors occur, and then catch and handle them individually, e.g.:
try: ...
except aiohttp.ClientConnectionError as e:
    ...  # deal with this type of exception
except aiohttp.ClientResponseError as e:
    ...  # handle individually
except asyncio.exceptions.TimeoutError as e:
    ...  # these kinds of errors happened to me as well
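Since the question already uses asyncio.gather(..., return_exceptions=True), the same idea can be applied after the fact: failed tasks come back as exception objects in the result list, so they can be sorted by type per URL. Here is a minimal sketch of that. The coroutine `fake_ping`, the function `ping_all` and the URLs are stand-ins invented for illustration (no network access, standard library only); they are not part of the question's code.

```python
import asyncio


async def fake_ping(url):
    """Hypothetical stand-in for a real request; fails depending on the URL."""
    await asyncio.sleep(0)
    if "slow" in url:
        raise asyncio.TimeoutError()
    if "down" in url:
        raise ConnectionError("connection refused")
    return 200


async def ping_all(urls):
    # return_exceptions=True: a failing task yields its exception as a result
    # value instead of propagating it out of gather().
    outcomes = await asyncio.gather(
        *(fake_ping(url) for url in urls), return_exceptions=True
    )
    results = []
    for url, outcome in zip(urls, outcomes):
        if isinstance(outcome, asyncio.TimeoutError):
            results.append((url, False, "timeout"))
        elif isinstance(outcome, Exception):
            results.append((url, False, type(outcome).__name__))
        else:
            results.append((url, outcome == 200, str(outcome)))
    return results


if __name__ == "__main__":
    for row in asyncio.run(ping_all(["http://ok", "http://slow", "http://down"])):
        print(row)
```

gather() keeps the results in the same order as the inputs, which is why zipping them with the URL list is safe here.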
Upvotes: 0
Reputation: 2096
You may want to try setting a session timeout for your client session. This can be done like:
import asyncio
from aiohttp import ClientSession, ClientTimeout

async def ping_urls(urllist, endpoint):
    headers = ... # not relevant
    timeout = ClientTimeout(total=TIMEOUT_SECONDS)
    async with ClientSession(timeout=timeout) as session:
        try:
            results = await asyncio.gather(
                *[
                    ping_url(url, session, headers, endpoint)
                    for url in urllist
                ],
                return_exceptions=True
            )
        except Exception as e:
            print(repr(e))
        return results
This should set the ClientSession instance to use TIMEOUT_SECONDS as the default timeout for each request made through it. Obviously you will need to set that value to something appropriate!
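One possible explanation for the cascade of timeouts, offered as an assumption and not verified against the question's setup: when every request is started at once, requests queued behind others may use up their time budget while waiting for a free connection. A minimal sketch of one way around that is to bound concurrency with asyncio.Semaphore and start each call's clock only once it holds a slot, using asyncio.wait_for. The names `bounded_ping` and `fake_fetch`, the limit and the delays are hypothetical, standard-library-only stand-ins; in real code `fetch` would wrap the `session.get(...)` call.

```python
import asyncio


async def bounded_ping(url, fetch, semaphore, timeout):
    """Run fetch(url) with its own timeout, after waiting for a free slot."""
    async with semaphore:  # time spent waiting here does not count against the timeout
        try:
            status = await asyncio.wait_for(fetch(url), timeout=timeout)
            return url, status == 200, str(status)
        except asyncio.TimeoutError:
            return url, False, "timeout"
        except Exception as e:
            return url, False, repr(e)


async def ping_urls(urls, fetch, limit=2, timeout=0.2):
    semaphore = asyncio.Semaphore(limit)  # created inside the running loop
    return await asyncio.gather(
        *(bounded_ping(url, fetch, semaphore, timeout) for url in urls)
    )


async def fake_fetch(url):
    """Hypothetical stand-in for an HTTP call: 'slow' URLs take too long."""
    await asyncio.sleep(1.0 if "slow" in url else 0.01)
    return 200


if __name__ == "__main__":
    urls = ["http://a", "http://slow", "http://b", "http://c"]
    for row in asyncio.run(ping_urls(urls, fake_fetch)):
        print(row)
```

In this sketch only the slow URL is reported as a timeout; the other calls are unaffected because each one has its own clock.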
Upvotes: 2