InnocentBystander

Reputation: 711

How to collect the responses from aiohttp sessions

I'm working with asyncio and aiohttp to call an API many times. While I can print the responses, I want to collate them into a combined structure such as a list or a pandas DataFrame.

In my example code I'm connecting to 2 urls and printing a chunk of the response. How can I collate the responses and access them all?

import asyncio, aiohttp

async def get_url(session, url, timeout=300):
    async with session.get(url, timeout=timeout) as response:
        http = await response.text()
    print(str(http[:80])+'\n')
    return http    # becomes a list item when gathered
   
async def async_payload_wrapper():
    # test with 2 urls as PoC
    urls = ['https://google.com','https://yahoo.com']
    # the session picks up the running event loop itself; passing loop= is deprecated
    async with aiohttp.ClientSession() as session:
        urls_to_check = [get_url(session, url) for url in urls]
        await asyncio.gather(*urls_to_check)

if __name__ == '__main__':
    # asyncio.run creates the event loop, runs the coroutine and closes the loop
    asyncio.run(async_payload_wrapper())

I've tried writing the responses to a file. That works, but it's slow and I have to read the file back in for further processing. I've also tried appending to a global variable, without success: using a variable inside get_url that is defined outside it raises an error, e.g. NameError: name 'my_list' is not defined or UnboundLocalError: local variable 'my_list' referenced before assignment.
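
For reference, the UnboundLocalError can be reproduced without aiohttp. The sketch below is only an illustration, and the names results, rebinding_worker and mutating_worker are hypothetical. It assumes the failing code rebinds the outer name (for example with +=), which makes Python treat the name as local to the function; mutating the list with append() does not rebind it.

```python
import asyncio

results = []    # module-level list (hypothetical name)

async def rebinding_worker(value):
    # "results += ..." is an assignment, so Python treats results as a local
    # variable of this function and fails when it tries to read it first.
    results += [value]

async def mutating_worker(value):
    # append() mutates the existing list and never rebinds the name,
    # so no "global" declaration is needed.
    results.append(value)

async def main():
    try:
        await rebinding_worker(1)
    except UnboundLocalError as exc:
        error_name = type(exc).__name__
    else:
        error_name = None
    await asyncio.gather(mutating_worker(2), mutating_worker(3))
    return error_name, sorted(results)

outcome = asyncio.run(main())
print(outcome)
```

A NameError, by contrast, would mean the name is not visible at module level at all when the coroutine runs, for instance because it was created inside another function.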

Upvotes: 1

Views: 1181

Answers (1)

InnocentBystander

Reputation: 711

Thanks @python_user, that's exactly what I was missing: the awaited asyncio.gather call returns the results, and the returned type is indeed a simple list. I think I'd tried to pick up the responses inside the await part, which doesn't work.

My updated PoC code below.
Adapting this for the API, JSON and pandas should now be easy :)

import asyncio, aiohttp

async def get_url(session, url, timeout=300):
    async with session.get(url, timeout=timeout) as response:
        http = await response.text()
    return http[:80]    # becomes a list element
   
async def async_payload_wrapper():
    # test with 2 urls as PoC
    urls = ['https://google.com','https://yahoo.com']
    # the session picks up the running event loop itself; passing loop= is deprecated
    async with aiohttp.ClientSession() as session:
        urls_to_check = [get_url(session, url) for url in urls]
        # gather returns the results as a list, in the same order as urls
        responses = await asyncio.gather(*urls_to_check)
    print(type(responses))
    print(responses)

if __name__ == '__main__':
    # asyncio.run creates the event loop, runs the coroutine and closes the loop
    asyncio.run(async_payload_wrapper())
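
As a possible next step towards a tabular structure, here is a minimal sketch of pairing each result with its URL. To keep it runnable offline it assumes a hypothetical stand-in coroutine, fake_fetch, in place of get_url, and made-up example.com URLs. It also passes return_exceptions=True to asyncio.gather so that a failed request shows up as an exception object in the list rather than aborting the whole batch; that choice is mine, not part of the PoC above.

```python
import asyncio

async def fake_fetch(url, delay):
    # Hypothetical stand-in for get_url(): sleeps instead of doing network I/O.
    await asyncio.sleep(delay)
    if "bad" in url:
        raise ValueError(f"could not fetch {url}")
    return f"body of {url}"

async def collect(urls_and_delays):
    tasks = [fake_fetch(url, delay) for url, delay in urls_and_delays]
    # Results come back in the order the awaitables were passed in,
    # regardless of which one finished first.
    results = await asyncio.gather(*tasks, return_exceptions=True)
    records = []
    for (url, _), result in zip(urls_and_delays, results):
        failed = isinstance(result, Exception)
        records.append({
            "url": url,
            "ok": not failed,
            "body": None if failed else result,
            "error": repr(result) if failed else None,
        })
    return records

records = asyncio.run(collect([
    ("https://example.com/slow", 0.02),
    ("https://example.com/bad", 0.0),
    ("https://example.com/fast", 0.01),
]))
for record in records:
    print(record)
```

A list of dicts like records can be passed straight to pandas.DataFrame(records), giving one row per URL.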

Upvotes: 1
