Reputation: 3
I'm a newbie with asyncio and aiohttp. Recently I've been practicing to understand how the event loop actually works.
While practicing sending requests to several URLs simultaneously, I ran into a problem. As I understand it, create_task puts a coroutine into the event loop, and await makes the event loop switch to other tasks until the awaited task is done. But the following result surprised me: the first part of blockmain runs synchronously (in blocking mode), while the second part works as I expect (concurrently, matching what I've learned about async/await and asyncio). I'm not sure whether I've misunderstood async/await and asyncio in this situation. If someone really knows what's going on, please give me a detailed answer. It really bothers me.
Sorry for my poor English.
Here is my code:
import asyncio
from aiohttp import ClientSession, ClientTimeout

urls = [
    'https://www.104.com.tw/jobs/search/?keyword=python&order=1&page=1&jobsource=2018indexpoc&ro=0',
    'https://www.104.com.tw/jobs/search/?keyword=python&order=1&page=2&jobsource=2018indexpoc&ro=0',
    'https://www.104.com.tw/jobs/search/?keyword=python&order=1&page=3&jobsource=2018indexpoc&ro=0',
    'https://www.104.com.tw/jobs/search/?keyword=python&order=1&page=4&jobsource=2018indexpoc&ro=0',
    'https://www.104.com.tw/jobs/search/?keyword=python&order=1&page=5&jobsource=2018indexpoc&ro=0',
    'http://www.httpbin.org:12345/',
    'https://www.104.com.tw/jobs/search/?keyword=python&order=1&page=6&jobsource=2018indexpoc&ro=0',
    'https://www.104.com.tw/jobs/search/?keyword=python&order=1&page=7&jobsource=2018indexpoc&ro=0',
    'https://www.104.com.tw/jobs/search/?keyword=python&order=1&page=8&jobsource=2018indexpoc&ro=0',
    'https://www.104.com.tw/jobs/search/?keyword=python&order=1&page=9&jobsource=2018indexpoc&ro=0',
    'https://www.104.com.tw/jobs/search/?keyword=python&order=1&page=10&jobsource=2018indexpoc&ro=0']
async def fetch_(link):
    # loop = asyncio.get_event_loop()
    # print(asyncio.all_tasks(loop))
    async with ClientSession(timeout=ClientTimeout(total=10)) as session:
        async with session.get(link) as response:
            html_body = await response.text()
            print(f"{link} is done")
async def blockmain():
    # ========================= the following 2 lines don't work as I expect
    for link in urls:
        await asyncio.create_task(fetch_(link))

    # second part
    # ========================= the following 3 lines work as I expect
    # loop 1
    tasks = [asyncio.create_task(fetch_(link)) for link in urls]
    for t in tasks:
        await t
    # loop 2
    tasks = [asyncio.create_task(fetch_(link)) for link in urls]
    for t in tasks:
        await t

asyncio.run(blockmain())
I want to know why the program runs synchronously (in blocking mode) when I await asyncio.create_task inside the for loop, but runs concurrently when I await the tasks only after creating all of them.
Thanks.
Upvotes: 0
Views: 997
Reputation: 11009
In the first case you are not running the tasks concurrently.

    for link in urls:
        await asyncio.create_task(fetch_(link))

The expression asyncio.create_task schedules the coroutine fetch_ as a task. The await keyword suspends the current task (blockmain) and waits for the fetch_ task to complete. Those are the only two tasks at that point. When the fetch_ task finishes, the main task continues. It goes through the loop again with a new value for link. That process repeats. You never have two fetch_ tasks running at the same time, since you await each task as you create it. There is no useful concurrent execution.
In the second case you get concurrent execution, since you create all the tasks before you await for the first time. The instances of fetch_ take turns, switching from one task to another each time one of the tasks needs to await something.
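You can see the difference with a minimal, self-contained sketch that uses asyncio.sleep in place of the HTTP request (the names here are illustrative, not from your code):

```python
import asyncio
import time

async def work(n):
    # stand-in for the HTTP request inside fetch_
    await asyncio.sleep(0.1)
    return n

async def sequential():
    # await each task as it is created: the loop waits for one
    # task to finish before creating the next, so total time ~ 3 * 0.1s
    for n in range(3):
        await asyncio.create_task(work(n))

async def concurrent():
    # create all tasks first, then await: they sleep at the same
    # time, so total time ~ 0.1s
    tasks = [asyncio.create_task(work(n)) for n in range(3)]
    for t in tasks:
        await t

start = time.perf_counter()
asyncio.run(sequential())
seq_time = time.perf_counter() - start

start = time.perf_counter()
asyncio.run(concurrent())
conc_time = time.perf_counter() - start

print(f"sequential: {seq_time:.2f}s, concurrent: {conc_time:.2f}s")
```

The sequential version takes roughly three times as long, for exactly the reason described above: each await hands control back only to the one task that was just created.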
However, the code for your second case is longer than it needs to be. See the documentation for the asyncio.gather function. You could replace each three-line block with one line. Note that gather takes the awaitables as separate positional arguments, so the generator must be unpacked:

    await asyncio.gather(*(fetch_(link) for link in urls))

The gather function automatically wraps the coroutines in tasks and waits until they are all finished.
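As a small sketch (again substituting asyncio.sleep for the HTTP request), gather also returns the results in the order the coroutines were passed, regardless of which task finishes first:

```python
import asyncio

async def fetch_stub(n):
    # hypothetical stand-in for fetch_; returns its argument
    await asyncio.sleep(0.01)
    return n

async def main():
    # gather wraps each coroutine in a task, runs them concurrently,
    # and returns their results in argument order
    return await asyncio.gather(*(fetch_stub(n) for n in range(5)))

results = asyncio.run(main())
print(results)  # [0, 1, 2, 3, 4]
```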
Upvotes: 0