Leo

Reputation: 121

asyncio behavior in Jupyter vs. script

I have the following code:

async def fetch(session, url):
    video_id = url.split('/')[-2]
    async with session.get(url) as response:
        data = await response.text()
        async with aiofiles.open(f'{video_id}.json', 'w') as f:
            await f.write(data)


async def main(loop, urls):
    async with aiohttp.ClientSession(loop=loop) as session:
        tasks = [fetch(session, url) for url in urls]
        await asyncio.gather(*tasks)


if __name__ == '__main__':
    links = generate_links()
    loop = asyncio.get_event_loop()
    await main(loop, links)

The code runs smoothly in a Jupyter notebook, but it fails when run as a .py script with SyntaxError: 'await' outside function.

I'm trying to understand what is happening here and why this is the case.

Upvotes: 2

Views: 2132

Answers (1)

Leo

Reputation: 121

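For anybody else trying to figure this out, Galunid's tip was spot on. The issue was the way the loop object was used. Removing the loop argument from ClientSession() makes the session fall back to its default, asyncio.get_event_loop(), which inside main() is the loop that is already running. The top-level await also has to go: in a regular script, await is only valid inside an async def function, so main() is started with asyncio.run() instead.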

The final form is given below.

import asyncio

import aiofiles
import aiohttp


async def fetch(session, link):
    video_id = link.split('/')[-2]
    async with session.get(link) as response:
        data = await response.text()
        async with aiofiles.open(f'{video_id}.json', 'w') as f:
            await f.write(data)


async def main(urls):
    async with aiohttp.ClientSession() as session: 
        tasks = [fetch(session, url) for url in urls]
        await asyncio.gather(*tasks)


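if __name__ == '__main__':
    links = generate_links()  # defined elsewhere; returns the URLs to fetch
    # asyncio.run() creates a fresh event loop, runs main() to completion, and closes the loop
    asyncio.run(main(links))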

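Jupyter notebooks already run an event loop in the background, which is why a notebook cell can await the result directly. A plain .py script has no running loop at module level, so the same await is rejected as a SyntaxError before any code runs.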
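
To illustrate the difference without needing network access, here is a minimal sketch. The fetch() below is a hypothetical stand-in that only sleeps (it is not the aiohttp version above), and top_level_await_compiles() is a helper name made up for this example. It compiles a top-level await the way a script would be compiled, then runs the same coroutine through asyncio.run().

```python
import asyncio


async def fetch(item):
    # Hypothetical stand-in for the real aiohttp request: waits briefly, returns a value.
    await asyncio.sleep(0.01)
    return item * 2


async def main(items):
    # Run all coroutines concurrently; gather() returns results in input order.
    return await asyncio.gather(*(fetch(i) for i in items))


def top_level_await_compiles():
    # Compile a module-level `await` the way a regular .py script is compiled.
    try:
        compile("await main([1, 2, 3])", "<script>", "exec")
    except SyntaxError:
        return False
    return True


if __name__ == '__main__':
    print(top_level_await_compiles())
    # In a script, asyncio.run() is the entry point that supplies the event loop.
    print(asyncio.run(main([1, 2, 3])))
```

Note that this works the other way round in a notebook: calling asyncio.run() from a cell typically raises RuntimeError because a loop is already running there, so in Jupyter you would write await main(...) instead.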

Upvotes: 1
