Reputation: 395
I wrote Python code for a Q-learning algorithm, and I have to run it multiple times since the algorithm has random output, so I use the multiprocessing module. The structure of the code is as follows:
import numpy as np
import scipy as sp
import multiprocessing as mp
# ...import other modules...
# ...define some parameters here...
# using multiprocessing
result = []
num_threads = 3
pool = mp.Pool(num_threads)
for cnt in range(num_threads):
    args = (RL_params + phys_params)  # arguments
    result.append(pool.apply_async(Q_learning, args))
pool.close()
pool.join()
There is no I/O operation in my code, my workstation has 6 cores (12 threads), and there is enough memory for this job. When I run the code with num_threads=1, it takes only 13 seconds and the job occupies only 1 thread, with CPU usage at 100% (checked with the top command).
[screenshot of CPU status with num_threads=1]
However, if I run it with num_threads=3 (or more), it takes more than 40 seconds and the job occupies 3 threads, each using 100% of a CPU core.
[screenshot of CPU status with num_threads=3]
I can't understand this slowdown, because there is no parallelization inside any of the self-defined functions and no I/O operation. It is also interesting that when num_threads=1, CPU usage is always below 100%, but when num_threads is larger than 1, CPU usage is sometimes 101% or 102%.
On the other hand, I wrote another simple test file which does not import numpy and scipy, and the problem never shows up there. I have noticed the question why isn't numpy.mean multithreaded?, and it seems my problem is due to the automatic parallelization of some methods in numpy (such as dot). But as the pictures show, I can't see any parallelization when I run a single job.
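If that were the cause, one way I could test it (just a sketch, assuming my numpy is built against OpenBLAS or MKL) would be to cap the BLAS thread count before numpy is imported and see whether the timings change:

import os

# Limit the BLAS backend (OpenBLAS/MKL) to one thread; this must be set
# before numpy is imported for the first time.
os.environ["OMP_NUM_THREADS"] = "1"
os.environ["OPENBLAS_NUM_THREADS"] = "1"
os.environ["MKL_NUM_THREADS"] = "1"

import numpy as np  # imported only after the environment variables are set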
Upvotes: 1
Views: 2304
Reputation: 18625
When you use a multiprocessing pool, all the arguments and results get sent through pickle. This can be very processor-intensive and time-consuming. That could be the source of your problem, especially if your arguments and/or results are large. In those cases, Python may spend more time pickling and unpickling the data than it does running computations.
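As a rough check (a sketch only; the array below is just a stand-in for whatever you actually pass), you can time how long pickling the arguments takes:

import pickle
import time
import numpy as np

# Stand-in for the real arguments; replace with the tuple you pass to
# pool.apply_async (RL_params + phys_params in your code).
args = (np.random.rand(2000, 2000),)

start = time.perf_counter()
payload = pickle.dumps(args)          # this is what the pool sends to a worker
elapsed = time.perf_counter() - start
print("pickled size: %.1f MB, pickle time: %.3f s" % (len(payload) / 1e6, elapsed))

If the size is large or the time is a noticeable fraction of your 13-second run, serialization overhead is a likely culprit.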
However, numpy releases the global interpreter lock during computations, so if your work is numpy-intensive, you may be able to speed it up by using threading instead of multiprocessing. That would avoid the pickling step. See here for more details: https://stackoverflow.com/a/38775513/3830997
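A minimal sketch of the thread-based version, assuming Q_learning, RL_params and phys_params are defined as in your question (this only helps if Q_learning spends most of its time in numpy routines that release the GIL):

from multiprocessing.pool import ThreadPool

# Same structure as your code, but with threads instead of processes,
# so arguments and results are shared directly rather than pickled.
num_threads = 3
result = []
pool = ThreadPool(num_threads)
for cnt in range(num_threads):
    args = (RL_params + phys_params)
    result.append(pool.apply_async(Q_learning, args))
pool.close()
pool.join()
outputs = [r.get() for r in result]  # collect the return values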
Upvotes: 3