Reputation: 6828
I'm an experienced parallel programmer in OpenMP and C++. Now I'm trying to understand parallelism in Python and the multiprocessing library.
In particular, I'm trying to parallelize this simple code, which randomly increments an array element 100 times:
from random import randint
import multiprocessing as mp
import numpy as np

def random_add(x):
    x[randint(0, len(x) - 1)] += 1

if __name__ == "__main__":
    print("Serial")
    x = np.zeros(8)
    for i in range(100):
        random_add(x)
    print(x)

    print("Parallel")
    x = np.zeros(8)
    processes = [mp.Process(target=random_add, args=(x,)) for i in range(100)]
    for p in processes:
        p.start()
    print(x)
However, this is the output:
Serial
[ 9. 18. 11. 15. 16. 8. 10. 13.]
Parallel
[ 0. 0. 0. 0. 0. 0. 0. 0.]
Why does this happen? I think I have an explanation: since we are using multiprocessing (and not multithreading), each process has its own address space, i.e., each spawned process gets its own copy of x, which is destroyed once random_add(x) terminates. As a consequence, the x in the main program is never actually updated.
Is this correct? And if so, how can I solve this problem? In short, I need a global reduce operation that sums the results of all the random_add calls, obtaining the same result as the serial version.
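For concreteness, something along these lines is roughly what I mean by a reduce (just a sketch; random_add_partial and the split into four workers of 25 increments each are names and choices I made up for illustration):

from random import randint
import multiprocessing as mp
import numpy as np

def random_add_partial(n_increments):
    # Each worker fills a private array and returns it to the parent.
    x = np.zeros(8)
    for _ in range(n_increments):
        x[randint(0, len(x) - 1)] += 1
    return x

if __name__ == "__main__":
    with mp.Pool(4) as pool:
        partials = pool.map(random_add_partial, [25, 25, 25, 25])  # 100 increments in total
    print(sum(partials))  # element-wise sum of the partial arrays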
Upvotes: 1
Views: 3398
Reputation: 842
You should use shared memory objects in your case:
from random import randint
import multiprocessing as mp

def random_add(x):
    x[randint(0, len(x) - 1)] += 1

if __name__ == "__main__":
    print("Serial")
    x = [0] * 8
    for i in range(100):
        random_add(x)
    print(x)

    print("Parallel")
    x = mp.Array('i', 8)  # shared array of 8 ints, initialised to zero
    processes = [mp.Process(target=random_add, args=(x,)) for i in range(100)]
    for p in processes:
        p.start()
    for p in processes:
        p.join()          # wait for all processes before reading the result
    print(x[:])
I've changed the numpy array to an ordinary list to keep the code clear.
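If you want to keep the numpy interface, one possible sketch (my own variation, not tested against your exact setup) is to view the shared buffer as a numpy array with np.frombuffer; the lock of the shared Array is used to make each increment atomic:

from random import randint
import multiprocessing as mp
import numpy as np

def random_add(shared):
    # Re-wrap the shared ctypes buffer in this process; the memory itself is shared,
    # so increments from every process land in the same buffer.
    x = np.frombuffer(shared.get_obj())
    with shared.get_lock():            # make the read-modify-write atomic
        x[randint(0, len(x) - 1)] += 1

if __name__ == "__main__":
    shared = mp.Array('d', 8)          # eight doubles, initialised to zero
    processes = [mp.Process(target=random_add, args=(shared,)) for _ in range(100)]
    for p in processes:
        p.start()
    for p in processes:
        p.join()                       # wait before reading the result
    print(np.frombuffer(shared.get_obj()))

This works because mp.Array('d', ...) wraps a ctypes array of doubles, which exposes the buffer protocol, so np.frombuffer gives a float64 view of the same shared memory.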
Upvotes: 3