Lau Real

Reputation: 321

Python 3.6: start 1 million requests with aiohttp and asyncio

I'm trying to make 1 million requests with aiohttp and asyncio, running them continuously with 10k at a time. When I print the start time of each request, I find that the 1 million requests do NOT all start at nearly the same time; instead, they are spread over several minutes. My understanding was that the 1 million requests would be sent without any wait (or within microseconds?). I hope someone can suggest how to change the code; my code is below. Thanks in advance!

import asyncio
import json
from aiohttp import ClientSession
from datetime import datetime
import uvloop


# login config
URL_LOGIN = "https://test.com/user/login"
APP_ID = "sample_app_id"
APP_SECRET = "sample_secret"


async def login_user(phone, password, session, i):
    start_time = datetime.now()
    h = {
        "Content-Type": "application/json"
    }
    data = {
        "phone": phone,
        "password": password,
        "appid": APP_ID,
        "appsecret": APP_SECRET
    }
    try:
        async with session.post(url=URL_LOGIN, data=json.dumps(data), headers=h) as response:
            r = await response.read()
            end_time = datetime.now()
            cost = (end_time-start_time).seconds
            msg = "number %d request,start_time:%s, cost_time: %d, response: %s\n" % (i, start_time, cost, r.decode())
            print("running %d" % i, datetime.now())
    except Exception as e:
        print("running %d" % i)
        msg = "number %d request raise error" % i+str(e)+"\n"
    with open("log", "a+") as f:
        f.write(msg)


async def bound_login(sem, phone, password, session, i):
    async with sem:
        await login_user(phone, password, session, i)


async def run_login(num):
    tasks = []
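    # allow at most 10,000 requests to run concurrently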
    sem = asyncio.Semaphore(10000)
    async with ClientSession() as session:
        for i in range(num):
            task = asyncio.ensure_future(bound_login(sem, str(18300000000+i), "123456", session, i))
            tasks.append(task)
        responses = asyncio.gather(*tasks)
        await responses

start = datetime.now()
number = 100000
loop = uvloop.new_event_loop()
asyncio.set_event_loop(loop)
future = asyncio.ensure_future(run_login(number))
loop.run_until_complete(future)  # drive the event loop until every request has finished

Upvotes: 0

Views: 3150

Answers (1)

user4815162342

Reputation: 155580

When I print the start time of each request, I find that the 1 million requests do NOT all start at nearly the same time; instead, they are spread over several minutes.

Your code does issue a total of 1 million requests, but with the constraint that no more than 10 thousand of them run in parallel at any given time. This is like having 10,000 request slots at your disposal: the first 10,000 requests are started immediately, but the 10,001st has to wait for a previous request to finish so it can get a free slot.

This is why the 1 million requests cannot start instantaneously or near-instantaneously: most of them have to wait for some earlier download to finish, and that takes time.
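To see the slot behaviour in isolation, here is a tiny sketch with a hypothetical sleep standing in for the HTTP request: with asyncio.Semaphore(2), the first two tasks start immediately, and the last two only start once an earlier task has released its slot.

import asyncio
from datetime import datetime


async def fake_request(sem, i):
    async with sem:  # only 2 "slots": tasks 2 and 3 wait for a free one
        print("task %d started at %s" % (i, datetime.now()))
        await asyncio.sleep(1)  # stands in for the actual download


async def main():
    sem = asyncio.Semaphore(2)
    await asyncio.gather(*(fake_request(sem, i) for i in range(4)))


asyncio.get_event_loop().run_until_complete(main())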

My understanding was that the 1 million requests would be sent without any wait

The current code explicitly makes the requests wait in order to prevent more than 10k of them from running in parallel. If you really want to (try to) make a million parallel requests, remove the semaphore and create the ClientSession using a connector with limit set to None.

However, be aware that maintaining a million open connections will likely not work due to limits of the operating system and the hardware. (You should still be able to start the connections near-instantaneously, but I'd expect most of them to exit with an exception shortly afterwards.)
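As a minimal sketch of that change, kept close to the question's code (the URL, phone numbers, and payload fields are the question's own placeholders; logging and per-request error handling are omitted):

import asyncio
import json
from aiohttp import ClientSession, TCPConnector

URL_LOGIN = "https://test.com/user/login"


async def login_user(phone, password, session, i):
    data = {"phone": phone, "password": password}
    headers = {"Content-Type": "application/json"}
    async with session.post(URL_LOGIN, data=json.dumps(data), headers=headers) as response:
        return await response.read()


async def run_login(num):
    # No semaphore: every request is created right away.
    # limit=None lifts aiohttp's default cap of 100 connections per session.
    async with ClientSession(connector=TCPConnector(limit=None)) as session:
        tasks = [asyncio.ensure_future(login_user(str(18300000000 + i), "123456", session, i))
                 for i in range(num)]
        return await asyncio.gather(*tasks, return_exceptions=True)


loop = asyncio.get_event_loop()
loop.run_until_complete(run_login(1000000))

Passing return_exceptions=True to gather keeps a single refused connection from cancelling all of the remaining requests, which matters once the operating system starts running out of sockets.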

Upvotes: 4
