Reputation: 299
I have a complex NumPy array signal with dimensions [10, 1000, 50000]. I need to modify this array in slices, which is done in a for loop:
for k in range(signal.shape[2]):
    signal[:,:,k] = myfunction(signal[:,:,k], constant1, constant2, constant5=constant5, constant6=constant6)
I have optimized myfunction as much as possible. When I run the script it takes quite some time, but it only uses 1 of 24 CPUs.
The code cannot be rewritten to perform myfunction on the entire array with numpy.
Therefore I want to speed up my code with parallel computing. There seem to be many different approaches to parallel computing in Python. Which one is best suited to my problem, and how can I implement it?
Upvotes: 3
Views: 1192
Reputation: 6213
Joblib provides easy execution for such 'embarrassingly parallel' tasks:
import numpy as np
# Initialize array and define function
np_array = np.random.rand(100,100,100)
my_function = lambda x: x / np.sum(x)
# Option 1: Loop over array and apply function
serial_result = np_array.copy()
for i in range(np_array.shape[2]):
    serial_result[:,:,i] = my_function(np_array[:,:,i])
Now using parallel execution with joblib:
# Option 2: Parallel execution
# ... Apply function in parallel
from joblib import Parallel, delayed
sub_arrays = Parallel(n_jobs=6)(              # Use 6 worker processes
    delayed(my_function)(np_array[:,:,i])     # Apply my_function
    for i in range(np_array.shape[2]))        # For each index along the 3rd dimension
# ... Stack the list of returned 2-D arrays back along the 3rd dimension
parallel_results = np.stack(sub_arrays, axis=2)
# Compare results
np.equal(serial_result, parallel_results).all() # True
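For an array as large as the one in the question (50000 slices), dispatching one task per slice may add noticeable overhead, because every slice is pickled, sent to a worker and sent back. A possible variation, sketched below, hands each worker a contiguous block of slices instead. The helper names `normalize_slice`, `process_chunk` and `apply_in_chunks`, as well as the `n_chunks` parameter, are made up for this illustration and are not part of joblib. Whether this is actually faster depends on myfunction and should be measured.

```python
import numpy as np
from joblib import Parallel, delayed


def normalize_slice(x):
    # Hypothetical stand-in for the asker's myfunction.
    return x / np.sum(x)


def process_chunk(chunk, func):
    # Apply func to every 2-D slice chunk[:, :, k] of one block.
    # Assumes func returns an array with the same shape and dtype as its input.
    out = np.empty_like(chunk)
    for k in range(chunk.shape[2]):
        out[:, :, k] = func(chunk[:, :, k])
    return out


def apply_in_chunks(arr, func, n_jobs=2, n_chunks=8):
    # Split the indices of the last axis into roughly equal contiguous blocks.
    index_blocks = np.array_split(np.arange(arr.shape[2]), n_chunks)
    # One joblib task per non-empty block instead of one task per slice.
    results = Parallel(n_jobs=n_jobs)(
        delayed(process_chunk)(arr[:, :, idx[0]:idx[-1] + 1], func)
        for idx in index_blocks
        if idx.size > 0
    )
    # Blocks come back in submission order, so concatenating restores the array.
    return np.concatenate(results, axis=2)
```

With the names from the example above, this would be called as `apply_in_chunks(np_array, my_function, n_jobs=6, n_chunks=24)`. Comparing the result against the serial loop with `np.allclose` rather than exact equality is the safer check, since floating-point sums can differ slightly with memory layout.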
Upvotes: 1