yellowpisagor

Reputation: 156

Best way to send multiple HTTP requests at the same time when collecting data from websites

I collect data from a website with Python for AI training. I send requests to the site's indexes one at a time. After parsing the HTML, if I find data that is useful for my purpose, I save it and send a request to the next index. There are more than 5 million websites that should be checked, so I think I need to send multiple requests at a time. Otherwise, I will never finish.

I am looking for the best way to send multiple requests at the same time. I know of these options: threads, multiple Python scripts, and async functions. But I am not sure which one is best.

Thank you.

Upvotes: 0

Views: 1933

Answers (1)

Lucas Abbade

Reputation: 817

I would use Requests Futures (requests-futures). It's a very simple asynchronous wrapper around Requests, and you can use it as follows:

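from concurrent.futures import as_completed
from requests_futures.sessions import FuturesSession

# Placeholder list; replace with the pages you want to fetch.
urls = ["https://example.com/index/1", "https://example.com/index/2"]

with FuturesSession() as session:
    # Each get() returns a Future immediately; requests run in background threads.
    futures = [session.get(url) for url in urls]
    # Handle responses in the order they finish, not the order they were sent.
    for future in as_completed(futures):
        res = future.result()
        print(res.json())  # assumes JSON responses; use res.text for HTML pages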
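
If you would rather not add a dependency, the same submit-then-as_completed pattern can be sketched with only the standard library. This is a rough sketch rather than a tuned crawler: `fetch_html`, `fetch_all`, the timeout of 10 seconds, and the default of 32 workers are all names and values I made up for illustration. The `fetch` parameter is there only so the download step can be swapped out (for example, with a stub in tests).

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from urllib.request import urlopen


def fetch_html(url, timeout=10):
    """Download one page and return its body as text (hypothetical helper)."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


def fetch_all(urls, fetch=fetch_html, max_workers=32):
    """Fetch URLs concurrently; return (results, errors) dicts keyed by URL."""
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to its URL so failures can be attributed.
        future_to_url = {pool.submit(fetch, url): url for url in urls}
        for future in as_completed(future_to_url):
            url = future_to_url[future]
            try:
                results[url] = future.result()
            except Exception as exc:  # one bad page should not stop the run
                errors[url] = exc
    return results, errors
```

With millions of URLs, you would probably want to call `fetch_all` on chunks of the list (say, a few thousand at a time) and save the results after each chunk, so that all the futures and page bodies are not held in memory at once.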

Upvotes: 1
