Filling a 3D array with multiprocessing

Question

I want to fill the data1 array in the following script using multiprocessing. Right now the script runs without errors, but the array doesn't get filled. I tried implementing this, but because I am iterating over two iterables, I couldn't get it to work. Help appreciated; thanks! By the way, I use Jupyter Notebook on the latest macOS.

import numpy as np
import multiprocessing as mp
from itertools import product

#Generate random data:
data = np.random.randn(12,20,20)

#Create empty array to store the result
data1 = np.zeros(data.shape, dtype=float)  # np.float was removed in NumPy 1.24

#Define the function
def fn(parameters):
    i   = parameters[0]
    j   = parameters[1]
    data1[:,i,j] =  data[:,i,j]

#Generate processes equal to the number of cores
pool = mp.Pool(processes=4)

# Generate values for each parameter: i.e. i and j
i = range(data.shape[1])
j = range(data.shape[2])

#generate a list of all combinations of the parameters
paramlist = list(product(i,j))

#call the function and multiprocessing
np.array(pool.map(fn,paramlist))
pool.close()

Roland Smith · Accepted Answer

Pool.map applies the function to the given data using worker processes. It then gathers the values returned by the function and sends them back to the parent.

Since your function doesn't return anything, you get no results.

What happens instead is that each worker modifies its own local copy of data1; the array in the parent process is never touched. :-)
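
One way around this, sketched below under the assumption that the per-column work can be written as a function that returns its result, is to let the worker hand the values back and do the assignment in the parent. The helpers make_tasks and collect are names made up for this illustration, and fn only copies the column, as in the question.

```python
import multiprocessing as mp
from itertools import product

import numpy as np


def fn(task):
    """Worker: receive (i, j, column) and return the processed column."""
    i, j, column = task
    # Placeholder for real work; the question simply copies the values.
    return i, j, column.copy()


def make_tasks(data):
    """Hypothetical helper: one task per (i, j) pair, carrying its column."""
    idx = product(range(data.shape[1]), range(data.shape[2]))
    return [(i, j, data[:, i, j]) for i, j in idx]


def collect(shape, results):
    """Hypothetical helper: write returned columns into a new array (parent side)."""
    out = np.zeros(shape, dtype=float)
    for i, j, column in results:
        out[:, i, j] = column
    return out


if __name__ == "__main__":
    data = np.random.randn(12, 20, 20)
    with mp.Pool(processes=4) as pool:
        # pool.map returns the workers' return values, in task order.
        results = pool.map(fn, make_tasks(data))
    data1 = collect(data.shape, results)
    print(np.array_equal(data, data1))
```

Passing the column along with the indices means the workers do not depend on a global data array. In a notebook, a function defined in a cell may not be importable by the workers when the start method is spawn (the default on macOS since Python 3.8), so fn may need to live in a separate module.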

When you have large amounts of data to modify, multiprocessing is often not a good solution because of the overhead of moving data between the worker processes and the parent.

Try it using a single process first.
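
A single-process baseline might look like the sketch below. The function name fill_serial is made up for this example; it does the same per-column assignment as the question's fn, just without a pool.

```python
from itertools import product

import numpy as np


def fill_serial(data):
    """Hypothetical baseline: same per-column assignment, one process."""
    out = np.zeros(data.shape, dtype=float)
    for i, j in product(range(data.shape[1]), range(data.shape[2])):
        out[:, i, j] = data[:, i, j]
    return out


data = np.random.randn(12, 20, 20)
data1 = fill_serial(data)
print(np.array_equal(data, data1))
```

For a plain copy like this, data.copy() does the same job in one call; the loop only matters once each column needs real computation.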
