Reputation: 478
What is the best and fastest Pythonic way to use multithreading for a PUT request that sits inside a for loop? At the moment the requests run synchronously, so the code takes too long. We would therefore like to use multithreading to improve the running time.
Synchronous:
def econ_post_customers(self, file, data):
    try:
        for i in range(0, len(file['collection'])):
            rp = requests.put(url=self.url, headers=self.headers, params=self.params, data=data)
    except StopIteration:
        pass
We attempted threading, but starting a new thread per iteration seems wasteful: we have thousands of iterations and might end up with far more, so that would become a big mess of threads. Maybe a pool would solve the problem, but this is where I am stuck.
Does anyone have an idea of how to solve this?
Parallel:
def econ_post_customers(self, file, data):
    try:
        for i in range(0, len(file['collection'])):
            threading.Thread(target=lambda: request_put(url, self.headers, self.params, data)).start()
    except StopIteration:
        pass

def request_put(url, headers, params, single):
    return requests.put(url=url, headers=headers, params=params, data=single)
Any help is highly appreciated. Thank you for your time!
Upvotes: 0
Views: 240
Reputation: 44283
If you want to use multithreading, then the following should work. However, I am a bit confused about a few things. You seem to be doing PUT requests in a loop, but all with the exact same arguments. I also don't quite see how you could get a StopIteration exception in the code you posted. Using a lambda expression as your target argument, rather than just specifying the function name and passing the arguments as a separate tuple or list (as is done below), is a bit unusual as well. Assuming that the loop variable i is in reality being used to index one value that actually varies in the call to request_put, then the function map could be a better choice than apply_async (see the sketch after the code below). It probably does not matter significantly for multithreading, but it could make a performance difference for multiprocessing if you had a very large list of elements over which you were looping.
from multiprocessing.pool import ThreadPool

import requests

def econ_post_customers(self, file, data):
    MAX_THREADS = 100  # some suitable value
    n_tasks = len(file['collection'])
    pool_size = min(MAX_THREADS, n_tasks)
    pool = ThreadPool(pool_size)
    for i in range(n_tasks):
        pool.apply_async(request_put, args=(self.url, self.headers, self.params, data))
    # wait for all submitted tasks to complete:
    pool.close()
    pool.join()

def request_put(url, headers, params, single):
    return requests.put(url=url, headers=headers, params=params, data=single)
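For completeness, here is a minimal sketch of the map variant mentioned above. It is based on an assumption the question does not confirm: that each item in file['collection'] is the payload that should vary between calls (so the original data argument goes unused). It also assumes the same class context (self.url, self.headers, self.params) as the question.

from functools import partial
from multiprocessing.pool import ThreadPool

import requests

def request_put(url, headers, params, single):
    return requests.put(url=url, headers=headers, params=params, data=single)

def econ_post_customers(self, file, data):
    MAX_THREADS = 100  # some suitable value
    items = file['collection']  # assumption: each item is the per-request payload
    pool_size = min(MAX_THREADS, len(items))
    with ThreadPool(pool_size) as pool:
        # partial fixes the arguments that never change; map supplies each item
        # as the final `single` argument and blocks until every call has finished
        results = pool.map(partial(request_put, self.url, self.headers, self.params), items)
    return results

Unlike apply_async, map also collects the return values in order, so you get the responses back without tracking AsyncResult objects yourself.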
Upvotes: 1
Reputation: 107
Do try the grequests module, which works with gevent (requests is not designed for async use). You should get great results with it. (If it is not working, please do say.)
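A minimal untested sketch of the usual grequests pattern, reusing the arguments from the question (one request per collection entry is an assumption): grequests.put builds an unsent request, and grequests.map sends them all concurrently via gevent.

import grequests

def econ_post_customers(self, file, data):
    # build one unsent request per item (assumption: one request per collection entry)
    reqs = [
        grequests.put(self.url, headers=self.headers, params=self.params, data=data)
        for _ in file['collection']
    ]
    # send them concurrently; returns a list of responses (None for failed requests)
    return grequests.map(reqs, size=100)  # size caps how many run at once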
Upvotes: 1