FakeDisy

Reputation: 13

Add a delay between futures in Python

The official Python documentation contains the following example:

import concurrent.futures
import urllib.request

URLS = ['http://www.foxnews.com/',
        'http://www.cnn.com/',
        'http://europe.wsj.com/',
        'http://www.bbc.co.uk/',
        'http://nonexistant-subdomain.python.org/']

# Retrieve a single page and report the URL and contents
def load_url(url, timeout):
    with urllib.request.urlopen(url, timeout=timeout) as conn:
        return conn.read()

# We can use a with statement to ensure threads are cleaned up promptly
with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
    # Start the load operations and mark each future with its URL
    future_to_url = {executor.submit(load_url, url, 60): url for url in URLS}
    for future in concurrent.futures.as_completed(future_to_url):
        url = future_to_url[future]
        try:
            data = future.result()
        except Exception as exc:
            print('%r generated an exception: %s' % (url, exc))
        else:
            print('%r page is %d bytes' % (url, len(data)))

It works great, but I need to set a delay between the created futures, so that the requests are not all sent at the same time but are spaced, for example, 100 ms apart.

Link to the docs: https://docs.python.org/3/library/concurrent.futures.html#threadpoolexecutor-example

Can I implement this by changing only the line future_to_url = {executor.submit(load_url, url, 60): url for url in URLS}, or will it require reworking all the code?

Upvotes: 1

Views: 79

Answers (1)

Adon Bilivit

Reputation: 27211

Use a wrapper function that handles the delay as follows:

import concurrent.futures
import urllib.request
from time import sleep

URLS = ['http://www.foxnews.com/',
        'http://www.cnn.com/',
        'http://europe.wsj.com/',
        'http://www.bbc.co.uk/',
        'http://nonexistant-subdomain.python.org/']

# Retrieve a single page and report the URL and contents
def load_url(url, timeout):
    with urllib.request.urlopen(url, timeout=timeout) as conn:
        return conn.read()

# Submit the task, then pause in the main thread so the next submission
# is staggered by roughly 100 ms from this one.
def submit_with_delay(exe, func, url, timeout=60):
    future = exe.submit(func, url, timeout)
    sleep(0.1)
    return future

# We can use a with statement to ensure threads are cleaned up promptly
with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
    # Start the load operations and mark each future with its URL
    future_to_url = {submit_with_delay(executor, load_url, url): url for url in URLS}
    for future in concurrent.futures.as_completed(future_to_url):
        url = future_to_url[future]
        try:
            data = future.result()
        except Exception as exc:
            print('%r generated an exception: %s' % (url, exc))
        else:
            print('%r page is %d bytes' % (url, len(data)))

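If you would rather stay closer to changing just that one comprehension line, another option is to give each task its own start offset and let the worker thread sleep before it fetches. This is only a sketch: load_url_delayed and the i * 0.1 offset are not part of the original example, and it assumes it is acceptable for waiting tasks to occupy worker slots while they sleep.

import concurrent.futures
import urllib.request
from time import sleep

URLS = ['http://www.foxnews.com/', 'http://www.cnn.com/',
        'http://europe.wsj.com/', 'http://www.bbc.co.uk/',
        'http://nonexistant-subdomain.python.org/']

# Each worker waits for its own offset before opening the connection,
# so for this five-URL, five-worker example the requests still start
# roughly 100 ms apart even though all futures are submitted at once.
def load_url_delayed(url, timeout, delay):
    sleep(delay)
    with urllib.request.urlopen(url, timeout=timeout) as conn:
        return conn.read()

with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
    future_to_url = {executor.submit(load_url_delayed, url, 60, i * 0.1): url
                     for i, url in enumerate(URLS)}
    # The as_completed loop from the question works unchanged from here.

The trade-off compared to the wrapper: with submit_with_delay the main thread paces the submissions and a worker is only busy while it actually downloads, whereas with a per-task delay all futures exist immediately but each sleeping task ties up a worker for the duration of its offset.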
Upvotes: 0
