Reputation: 3862
I have 2 sockets with 20 cores each, so I would like to speed up some processing. But multiprocessing is always slower than the serial approach. Is there a reason for that? Am I not doing this in the most efficient way? Is it due to a lack of communication (a pipe or queue) between the processes?
import time
from multiprocessing import Pool
import numpy as np

# Classical serial approach
startime = time.time()

def f(x):
    return np.sqrt(x)

f(np.arange(1000))
print("--- %s seconds ---" % (time.time() - startime))

# Multiprocessing test
startime = time.time()
if __name__ == '__main__':
    p = Pool(40)
    test = p.map(np.sqrt, np.arange(1000), chunksize=1)
    print("--- %s seconds ---" % (time.time() - startime))
---- EDIT ----
With multiprocessing I need 2.92 seconds, while the serial version takes less than 1 second...
Upvotes: 0
Views: 758
Reputation: 40894
Starting processes is slow, even on a modern OS. Calculating 1000 square roots is blazing fast on modern hardware.
To reap the benefits of parallel processing, you have to spend much more time on the actual computation than on starting things up. Try computing something more expensive, like 1000 bcrypts, or something slow, like hitting 1000 different URLs.
With compute-intensive tasks, where every process eats 100% CPU, there's no point in having more processes than CPU cores.
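A minimal sketch of the point, using a hypothetical pure-Python `heavy()` function as a stand-in for an expensive task (the worker count of 8 and the task sizes are arbitrary assumptions; match the pool size to your core count). With enough work per task, the pool can amortize its start-up cost, unlike the 1000-square-roots example:

```python
import time
from multiprocessing import Pool

def heavy(n):
    # Pure-Python loop: genuinely CPU-bound work per task,
    # expensive enough to outweigh process start-up overhead.
    total = 0
    for i in range(n):
        total += i * i
    return total

if __name__ == '__main__':
    tasks = [2_000_000] * 40

    start = time.time()
    serial = [heavy(n) for n in tasks]
    print("serial:   %.2f s" % (time.time() - start))

    start = time.time()
    # One worker per core (8 here is an assumption); a larger
    # chunksize sends work in batches and keeps IPC overhead low.
    with Pool(8) as p:
        parallel = p.map(heavy, tasks, chunksize=5)
    print("parallel: %.2f s" % (time.time() - start))

    assert serial == parallel
```

The same structure with `np.sqrt` on 1000 numbers will stay slower in parallel, because each task finishes in microseconds and the per-task dispatch dominates.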
Upvotes: 1