Reputation: 478
What is the best and fastest Pythonic way to use multithreading for a PUT request that sits inside a for loop? At the moment the requests run synchronously, so the code takes too long. We would therefore like to use multithreading to improve the running time.
Synchronous:
def econ_post_customers(self, file, data):
    try:
        for i in range(0, len(file['collection'])):
            rp = requests.put(url=self.url, headers=self.headers, params=self.params, data=data)
    except StopIteration:
        pass
We attempted threading, but starting a new thread per iteration seems wasteful: we have thousands of iterations and might end up with far more, so that would become a big mess of threads. Maybe a pool would solve the problem, but this is where I am stuck.
Does anyone have an idea of how to solve this?
Parallel:
def econ_post_customers(self, file, data):
    try:
        for i in range(0, len(file['collection'])):
            threading.Thread(target=lambda: request_put(url, self.headers, self.params, data)).start()
    except StopIteration:
        pass

def request_put(url, headers, params, single):
    return requests.put(url=url, headers=headers, params=params, data=single)
Any help is highly appreciated. Thank you for your time!
Upvotes: 0
Views: 240
Reputation: 44283
If you want to use multithreading, then the following should work. However, I am a bit confused about a few things. You seem to be doing PUT requests in a loop, but all with the exact same arguments. I also don't quite see how you could get a StopIteration exception in the code you posted. Using a lambda expression as your target argument, rather than just specifying the function name and passing the arguments as a separate tuple or list (as is done below), is a bit unusual as well. Assuming that the loop variable i is in reality being used to index one value that actually varies in the call to request_put, then the function map could be a better choice than apply_async (see the sketch after the code below). It probably does not matter significantly for multithreading, but it could make a performance difference for multiprocessing if you had a very large list of elements over which you were looping.
from multiprocessing.pool import ThreadPool

import requests

def econ_post_customers(self, file, data):
    MAX_THREADS = 100  # some suitable value
    n_tasks = len(file['collection'])
    pool_size = min(MAX_THREADS, n_tasks)
    pool = ThreadPool(pool_size)
    for i in range(n_tasks):
        pool.apply_async(request_put, args=(self.url, self.headers, self.params, data))
    # wait for all submitted tasks to complete:
    pool.close()
    pool.join()

def request_put(url, headers, params, single):
    return requests.put(url=url, headers=headers, params=params, data=single)
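For completeness, here is a minimal sketch of the map variant mentioned above. It is based on an assumption the question does not confirm: that each item in file['collection'] is the payload that should vary between calls (so the original data argument goes unused). It also assumes the same class context (self.url, self.headers, self.params) as the question.

from functools import partial
from multiprocessing.pool import ThreadPool

import requests

def request_put(url, headers, params, single):
    return requests.put(url=url, headers=headers, params=params, data=single)

def econ_post_customers(self, file, data):
    MAX_THREADS = 100  # some suitable value
    items = file['collection']  # assumption: each item is the per-request payload
    pool_size = min(MAX_THREADS, len(items))
    with ThreadPool(pool_size) as pool:
        # partial fixes the arguments that never change; map supplies each item
        # as the final `single` argument and blocks until every call has finished
        results = pool.map(partial(request_put, self.url, self.headers, self.params), items)
    return results

Unlike apply_async, map also collects the return values in order, so you get the responses back without tracking AsyncResult objects yourself.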
Upvotes: 1
Reputation: 107
Do try the grequests module, which works with gevent (requests is not designed for async use). You should get great results with it. (If it is not working, please do say.)
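A minimal untested sketch of the usual grequests pattern, reusing the arguments from the question (one request per collection entry is an assumption): grequests.put builds an unsent request, and grequests.map sends them all concurrently via gevent.

import grequests

def econ_post_customers(self, file, data):
    # build one unsent request per item (assumption: one request per collection entry)
    reqs = [
        grequests.put(self.url, headers=self.headers, params=self.params, data=data)
        for _ in file['collection']
    ]
    # send them concurrently; returns a list of responses (None for failed requests)
    return grequests.map(reqs, size=100)  # size caps how many run at once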
Upvotes: 1