Reputation: 711
I'm working with asyncio and aiohttp to call an API many times. While can print the responses, I want to collate the responses into a combined structure - a list or pandas dataframe etc.
In my example code I'm connecting to 2 urls and printing a chunk of the response. How can I collate the responses and access them all?
import asyncio, aiohttp
async def get_url(session, url, timeout=300):
async with session.get(url, timeout=timeout) as response:
http = await response.text()
print(str(http[:80])+'\n')
return http # becomes a list item when gathered
async def async_payload_wrapper(async_loop):
# test with 2 urls as PoC
urls = ['https://google.com','https://yahoo.com']
async with aiohttp.ClientSession(loop=async_loop) as session:
urls_to_check = [get_url(session, url) for url in urls]
await asyncio.gather(*urls_to_check)
if __name__ == '__main__':
event_loop = asyncio.get_event_loop()
event_loop.run_until_complete(async_payload_wrapper(event_loop))
I've tried printing to a file, and that works but it's slow and I need to read it again for further processing. I've tried appending to a global variable without success. E.g. using a variable inside get_url that is defined outside it generates an error, eg:
NameError: name 'my_list' is not defined
or
UnboundLocalError: local variable 'my_list' referenced before assignment
Upvotes: 1
Views: 1181
Reputation: 711
Thanks @python_user that's exactly what I was missing and the returned type is indeed a simple list. I think I'd tried to pick up the responses inside the await part which doesn't work.
My updated PoC code below.
Adapting this for the API, JSON and pandas should now be easy : )
import asyncio, aiohttp
async def get_url(session, url, timeout=300):
async with session.get(url, timeout=timeout) as response:
http = await response.text()
return http[:80] # becomes a list element
async def async_payload_wrapper(async_loop):
# test with 2 urls as PoC
urls = ['https://google.com','https://yahoo.com']
async with aiohttp.ClientSession(loop=async_loop) as session:
urls_to_check = [get_url(session, url) for url in urls]
responses = await asyncio.gather(*urls_to_check)
print(type(responses))
print(responses)
if __name__ == '__main__':
event_loop = asyncio.get_event_loop()
event_loop.run_until_complete(async_payload_wrapper(event_loop))
Upvotes: 1