The Unfun Cat

Reputation: 31898

How to set the maximum number of concurrent workers in multiprocessing?

Let's say we start with vartec's answer, which shows how to use a multiprocessing worker:

import multiprocessing

def worker(procnum, return_dict):
    '''worker function'''
    print(str(procnum) + ' represent!')
    return_dict[procnum] = procnum


if __name__ == '__main__':
    manager = multiprocessing.Manager()
    return_dict = manager.dict()
    jobs = []
    for i in range(5):
        p = multiprocessing.Process(target=worker, args=(i, return_dict))
        jobs.append(p)
        p.start()

    for proc in jobs:
        proc.join()
    print(return_dict.values())

I want to do the same thing, only limit the number of concurrent processes to X. How can I do that using workers?

Using pool/map is not really the best option here as I have a for loop like this:

for item in items:
    result = heavy_lifting_which_cannot_be_parallelized(item)
    process_result_in_a_way_that_can_be_parallelized(result)

Therefore I'd like to start process_result_in_a_way_that_can_be_parallelized and immediately continue with my for loop, rather than wait until the loop has ended and only then multiprocess - that would be much more time-consuming.

Upvotes: 1

Views: 839

Answers (1)

Vasiliy Faronov

Reputation: 12310

You do not have to use map with a Pool. You can use apply_async to submit jobs to the pool on your own schedule.

pool = multiprocessing.Pool(processes=3)
for i in range(30):
    pool.apply_async(worker, (i, return_dict))
pool.close()
pool.join()
print(return_dict.values())

Upvotes: 1
