Reputation: 1
I am having trouble reaping the benefits of multiprocessing in Python. Basically, the computation time increases with every extra core added. So my guess is that it's due to overhead, but I am not sure what exactly I am doing wrong or how it could be improved/overcome. My real problem is a bit more complex, but I have prepared a simpler example to shed some light on my troubles.
Brief description:
I have a list of objects which are independent of each other. For every object I need to call a function which takes a dictionary of other objects as input, and that function is evaluated multiple times for each object in the initial list. So it's basically a loop within a loop.
Below is the code for a very simplified version of the problem:
import time
import multiprocessing as mp
import numpy as np

def test_fun(iter_nr, mp_dict, i, return_dict):
    dumm = 0
    for j in range(iter_nr):
        for val in mp_dict.values():
            dumm += val
    return_dict[i] = dumm

if __name__ == '__main__':
    # Manager setup lives under the main guard so the script also
    # works with the 'spawn' start method.
    manager = mp.Manager()
    return_dict = manager.dict()
    mp_dict = manager.dict()
    for i in range(100):
        mp_dict[str(i)] = 1

    nproc = [2, 4, 6, 8, 10, 12, 16, 20]
    nr_iter = 2*4*6*8*10
    print('Total number of iterations: ', nr_iter)

    for n_proc in nproc:
        nr_iter_array = (nr_iter / n_proc) * np.ones(n_proc)
        print('Nr CPUs: ', n_proc)
        print('Nr iterations per process: ', int(nr_iter_array[0]))
        start_time = time.time()
        jobs = []  # reset per run so we only join this run's processes
        for i in range(n_proc):
            p = mp.Process(target=test_fun,
                           args=(int(nr_iter_array[i]), mp_dict, i, return_dict))
            p.start()
            jobs.append(p)
        for job in jobs:
            job.join()
        end_time = time.time()
        print(round(end_time - start_time, 3), 'sec')
And here is the output:
Total number of iterations: 3840
Nr CPUs: 2
Nr iterations per process: 1920
0.661 sec
Nr CPUs: 4
Nr iterations per process: 960
1.385 sec
Nr CPUs: 6
Nr iterations per process: 640
1.674 sec
Nr CPUs: 8
Nr iterations per process: 480
1.524 sec
Nr CPUs: 10
Nr iterations per process: 384
1.992 sec
Nr CPUs: 12
Nr iterations per process: 320
2.072 sec
Nr CPUs: 16
Nr iterations per process: 240
2.186 sec
Nr CPUs: 20
Nr iterations per process: 192
2.607 sec
As you can see, the computation time increases with the number of cores, which is not what I would expect. Does anyone have any idea what is going on here and how to overcome it?
Upvotes: 0
Views: 1260
Reputation: 12205
This is a case of process creation overhead. Your tasks are too small for there to be any benefit from multiprocessing.
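You can see that fixed cost directly by timing processes that do no work at all. A quick sketch (the numbers will depend on your machine and start method):

import time
import multiprocessing as mp

def noop():
    pass  # no work at all, so any measured time is pure overhead

if __name__ == '__main__':
    for n_proc in [2, 8, 20]:
        start = time.time()
        procs = [mp.Process(target=noop) for _ in range(n_proc)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        print(n_proc, 'processes:', round(time.time() - start, 3), 'sec')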
When I ran your code, I got the same result as you did. However, when I started tweaking this part:

for i in range(100):
    mp_dict[str(i)] = 1

and changed the range to 1000, there was some benefit from multiprocessing (I am running on a laptop with a limited number of cores, so your results may vary):
Total number of iterations: 3840
Nr CPUs: 2
Nr iterations per process: 1920
0.237 sec
Nr CPUs: 4
Nr iterations per process: 960
0.216 sec
Nr CPUs: 6
Nr iterations per process: 640
0.221 sec
Nr CPUs: 8
Nr iterations per process: 480
0.224 sec
Nr CPUs: 10
Nr iterations per process: 384
0.233 sec
Nr CPUs: 12
Nr iterations per process: 320
0.231 sec
Nr CPUs: 16
Nr iterations per process: 240
0.243 sec
Nr CPUs: 20
Nr iterations per process: 192
0.255 sec
When I changed it to 10000, the improvement was there again:
Total number of iterations: 3840
Nr CPUs: 2
Nr iterations per process: 1920
1.578 sec
Nr CPUs: 4
Nr iterations per process: 960
1.063 sec
Nr CPUs: 6
Nr iterations per process: 640
1.076 sec
Nr CPUs: 8
Nr iterations per process: 480
1.083 sec
Nr CPUs: 10
Nr iterations per process: 384
1.098 sec
Nr CPUs: 12
Nr iterations per process: 320
1.08 sec
Nr CPUs: 16
Nr iterations per process: 240
1.099 sec
Nr CPUs: 20
Nr iterations per process: 192
1.122 sec
A tweak I did not try, but which would probably change the numbers again, is to replace your self-managed Processes with a Pool from multiprocessing. This example would probably work nicely with a Pool: a Pool launches N processes only once and then keeps reusing them, handing a worker more work as soon as it is free. Your code spawns and kills a lot of processes, while a pool does that only once. A rough sketch of what that could look like follows.
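This is only a sketch, not a drop-in replacement for your code: it assumes the worker can simply return its result (instead of writing into a managed dict) and that the dictionary can be passed as a plain dict, which gets pickled once per task:

import time
import multiprocessing as mp

def test_fun(iter_nr, work_dict):
    # Same inner loops as in the question, but the result is returned
    # instead of being written into a managed dict.
    dumm = 0
    for _ in range(iter_nr):
        for val in work_dict.values():
            dumm += val
    return dumm

if __name__ == '__main__':
    work_dict = {str(i): 1 for i in range(1000)}  # plain dict, no manager
    nr_iter = 2*4*6*8*10
    for n_proc in [2, 4, 6, 8, 10, 12, 16, 20]:
        start_time = time.time()
        with mp.Pool(processes=n_proc) as pool:
            # Give each worker an equal share of the iterations.
            tasks = [(nr_iter // n_proc, work_dict)] * n_proc
            results = pool.starmap(test_fun, tasks)
        print(n_proc, 'workers:', round(time.time() - start_time, 3), 'sec')

Note that this still creates a fresh pool for every timing run; in real code you would create the pool once and keep feeding it work.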
This is not a miracle cure, though. My favourite multiprocessing bugbear is data transmission: multiprocessing and pools move data between processes through queues, which are very slow, and managers use those same queues, so they are slow as well. The more data you transmit into or out of your workers, the more time you spend on that instead of on useful work. It is often better to re-engineer the code so that a worker does a start-to-finish task instead of sending a lot of data in and out. The snippet below gives a rough feel for the manager overhead.
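As a rough illustration, here is a sketch you can run in a single process, since every access to a managed dict is a round trip to the manager process regardless of who makes it:

import time
import multiprocessing as mp

def time_reads(d, repeats=1000):
    # Repeatedly scan the dict's values; on a managed dict every
    # .values() call is an IPC round trip, on a plain dict it is not.
    start = time.time()
    total = 0
    for _ in range(repeats):
        for val in d.values():
            total += val
    return round(time.time() - start, 3)

if __name__ == '__main__':
    manager = mp.Manager()
    managed = manager.dict({str(i): 1 for i in range(100)})
    plain = managed.copy()  # one-off copy; no IPC after this point
    print('managed dict:', time_reads(managed), 'sec')
    print('plain dict:  ', time_reads(plain), 'sec')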
Anyway, the conclusion is that it always depends on your case. Multiprocessing does not always improve performance, and even when it does, the improvement can be modest. It takes a bit of assessment, as you have done, to find the optimal spot.
This is a bit of a non-answer but it is too long for comments.
Upvotes: 1