Modify 3D numpy array in slices in parallel

Question

I have a complex numpy array signal with dimensions [10,1000,50000]. I need to modify this array in slices, which is currently done in a for loop:

for k in range(signal.shape[2]):
    signal[:,:,k] = myfunction(signal[:,:,k], constant1, constant2, constant5=constant5, constant6=constant6)

I have optimized myfunction as much as possible. The script takes quite some time to run, but it uses only 1 of 24 CPUs.

The code cannot be rewritten to apply myfunction to the entire array at once with numpy.

Therefore I want to speed up my code with parallel computing. There seem to be many different approaches to parallel computing in Python. Which one is best suited to my problem, and how can I implement it?

Deena · Accepted Answer

Joblib provides easy execution for such 'embarrassingly-parallel' tasks:

import numpy as np

# Initialize array and define function 
np_array = np.random.rand(100,100,100)
my_function = lambda x: x / np.sum(x)

# Option 1: Loop over array and apply function
serial_result = np_array.copy()
for i in range(np_array.shape[2]):
    serial_result[:,:,i] = my_function(np_array[:,:,i])

Now using parallel execution with joblib:

# Option 2: Parallel execution
# ... Apply function in Parallel 
from joblib import Parallel, delayed
sub_arrays = Parallel(n_jobs=6)(                            # Use 6 cores 
                      delayed(my_function)(np_array[:,:,i]) # Apply my_function 
                      for i in range(np_array.shape[2]))    # For each 3rd dimension

# ... Concatenate the list of returned arrays
parallel_results = np.stack(sub_arrays, axis=2)

# Compare results 
np.equal(serial_result, parallel_results).all() # True
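
The same pattern can be adapted to the loop from the question, where the function takes extra positional and keyword arguments and the results are written back into the original array. The following is only a minimal sketch: the body of myfunction and the constant values are hypothetical stand-ins (the question does not show them), and the array is much smaller than [10,1000,50000] so that it runs quickly.

```python
import numpy as np
from joblib import Parallel, delayed

# Hypothetical stand-in for the question's myfunction; the real body is not shown there.
def myfunction(block, constant1, constant2, constant5=1.0, constant6=0.0):
    return (block * constant1 + constant2) * constant5 + constant6

# Small complex array standing in for the [10,1000,50000] signal
rng = np.random.default_rng(0)
shape = (4, 6, 20)
signal = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

# Assumed example values for the constants
constant1, constant2, constant5, constant6 = 2.0, 1j, 0.5, 3.0

# Serial reference result, used only to check the parallel version
expected = signal.copy()
for k in range(signal.shape[2]):
    expected[:, :, k] = myfunction(signal[:, :, k], constant1, constant2,
                                   constant5=constant5, constant6=constant6)

# delayed() forwards positional and keyword arguments unchanged
results = Parallel(n_jobs=2)(
    delayed(myfunction)(signal[:, :, k], constant1, constant2,
                        constant5=constant5, constant6=constant6)
    for k in range(signal.shape[2])
)

# Results come back in submission order, so they can be written back slice by slice
for k, block in enumerate(results):
    signal[:, :, k] = block
```

Writing back only after Parallel has returned avoids modifying the array while slices are still being dispatched to the workers. Whether this actually beats the serial loop depends on how expensive myfunction is per slice compared with the cost of sending each slice to a worker process.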
