Reputation: 85
I'm trying to learn how to use the multiprocessing package in Python, and I've written the following code, which randomly generates a large 2D array and then works out how many numbers in each row are within a specified interval (in this case between 4 and 8):
import time
import multiprocessing as mp
import numpy as np

def how_many_within_range(row, minimum, maximum):
    count = 0
    for n in row:
        if minimum <= n <= maximum:
            count += 1
    return count

if __name__ == '__main__':
    data = np.random.randint(0, 10, size=[10000000, 5])
    print(data[:5])

    start_time = time.perf_counter()

    # With parallelisation
    with mp.Pool(mp.cpu_count()) as pool:
        results = [pool.apply(how_many_within_range, args=(row, 4, 8))
                   for row in data]

    # Without parallelisation
    # results = [how_many_within_range(row, 4, 8) for row in data]

    print(f'Time elapsed: {time.perf_counter() - start_time}')
    print(results[:5])
Without multiprocessing, the code runs in about 40 seconds, but with it, the program is much slower and doesn't finish in any realistic time. I'm pretty sure I've followed the tutorial I was using correctly, so what am I doing wrong?
Upvotes: 3
Views: 227
Reputation: 177406
.apply()
is the wrong function for this case. .starmap()
is more appropriate, but even then, for a per-row computation this simple, the overhead of starting the worker processes and transferring the large array between processes makes the parallel version slower overall. Passing a chunksize batches the rows so each worker receives fewer, larger tasks:
import time
import multiprocessing as mp
import numpy as np

def how_many_within_range(row, minimum, maximum):
    count = 0
    for n in row:
        if minimum <= n <= maximum:
            count += 1
    return count

if __name__ == '__main__':
    data = np.random.randint(0, 10, size=[1000000, 5])
    print(data[:5])

    # With parallelisation
    start_time = time.perf_counter()
    with mp.Pool() as pool:
        results = pool.starmap(how_many_within_range,
                               ((row, 4, 8) for row in data),
                               chunksize=1000)
    print(f'Time elapsed: {time.perf_counter() - start_time}')
    print(results[:5])

    # Without parallelisation
    start_time = time.perf_counter()
    results = [how_many_within_range(row, 4, 8) for row in data]
    print(f'Time elapsed: {time.perf_counter() - start_time}')
    print(results[:5])
Output:
[[1 4 8 9 2]
[9 1 6 7 0]
[0 7 6 8 4]
[4 5 6 9 9]
[6 6 9 9 1]]
Time elapsed: 3.3232607
[2, 2, 4, 3, 2]
Time elapsed: 2.4664016999999996
[2, 2, 4, 3, 2]
Upvotes: 1
Reputation: 889
Why do you need multiprocessing for such a simple function, especially when you are already working with NumPy arrays? Try this vectorised code instead:
%%timeit
np.sum((data>=4)&(data<=8), axis=1)
198 ms ± 3.44 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
There is no need to iterate over the array elements in a Python for loop; the vectorised expression runs almost instantly.
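As a minimal self-contained sketch (using a smaller 1000-row array so it runs quickly), the vectorised expression can be checked against the question's original loop to confirm they produce identical counts:

```python
import numpy as np

def how_many_within_range(row, minimum, maximum):
    # Original per-row Python loop, kept here for comparison
    count = 0
    for n in row:
        if minimum <= n <= maximum:
            count += 1
    return count

data = np.random.randint(0, 10, size=[1000, 5])

# Vectorised: one boolean mask over the whole array, summed per row
vectorised = np.sum((data >= 4) & (data <= 8), axis=1)

# The two approaches agree on every row
looped = [how_many_within_range(row, 4, 8) for row in data]
assert list(vectorised) == looped
```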
Upvotes: 0
Reputation: 58
From the documentation, Pool.apply() is blocking: it waits for each call to finish before the next one is submitted, so you pay the overhead of starting worker processes without gaining any parallelism.
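A minimal sketch of the non-blocking counterpart, apply_async() (square here is just a placeholder workload, not the question's function): each call returns an AsyncResult immediately, so all workers can be busy at once, and the results are collected afterwards with .get().

```python
import multiprocessing as mp

def square(x):
    return x * x

if __name__ == '__main__':
    with mp.Pool(4) as pool:
        # apply() would block on each call, keeping only one worker busy.
        # apply_async() submits the task and returns immediately.
        handles = [pool.apply_async(square, (i,)) for i in range(8)]
        # Collect results in submission order
        results = [h.get() for h in handles]
    print(results)  # [0, 1, 4, 9, 16, 25, 36, 49]
```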
Upvotes: 1