ch0l1n3

Reputation: 295

Parallel computing in Python significantly slower than regular for loop

So I'm trying to do some simple image analysis in Python. I have a NumPy array of the video in question with shape (930, 256, 256), i.e. 930 frames at a resolution of 256 by 256 pixels.

I'm trying to compute a seed-pixel correlation map in parallel. My computer has 12 cores, so I should be able to write a parallel for loop and get my results faster.

This is what I came up with after looking around for ways to write parallel for loops. However, it's significantly slower than the non-parallel version!

Perhaps someone can tell me a better way of writing it (using other libraries), or why it is slower?

Here's the code I came up with:

import numpy as np
from scipy.stats import pearsonr
from joblib import Parallel, delayed

def corr(pixel, seed_pixel):
    return pearsonr(pixel, seed_pixel)[0]

def get_correlation_map(seed_x, seed_y, frames):
    total_number_of_frames, width, height = frames.shape
    seed_pixel = np.asarray(frames[:, seed_x, seed_y], dtype=np.float32)

    # Reshape from (time, x, y) into (time, space)
    frames = np.reshape(frames, (total_number_of_frames, width * height))
    #####################################
    print('Getting correlation...')

    # The parallel version.
    correlation_map = Parallel(n_jobs=12)(delayed(corr)(pixel, seed_pixel) for pixel in frames.T)

    # Non-parallel version:
    # correlation_map = []
    # for i in range(frames.shape[-1]):
    #     correlation_map.append(pearsonr(frames[:, i], seed_pixel)[0])
    #####################################
    correlation_map = np.asarray(correlation_map, dtype=np.float32)
    correlation_map = np.reshape(correlation_map, (width, height))
    print(np.shape(correlation_map))

    return correlation_map

All I need is a way to parallelize a for loop that appends its results to a list in the order of iteration. So I suppose synchronization could be an issue!
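(Side note: as far as I can tell, joblib's Parallel already returns results in the order of the input iterable, so ordering at least shouldn't be the problem. A minimal check:)

```python
from joblib import Parallel, delayed

def square(x):
    return x * x

# Parallel collects results in the order of the input iterable,
# regardless of which worker finishes first, so no extra
# synchronization is needed to keep the list ordered.
results = Parallel(n_jobs=2)(delayed(square)(i) for i in range(5))
print(results)  # [0, 1, 4, 9, 16]
```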

Upvotes: 3

Views: 2568

Answers (1)

Victory

Reputation: 5890

You are likely having an issue because the arguments passed to Parallel are large and are being serialized for every task. You can use backend="threading" to avoid this if (as I assume) pearsonr releases the GIL. Otherwise you might have to look into numpy.memmap and stick with multiprocessing:

correlation_map = Parallel(n_jobs=12, backend="threading")(delayed(corr)(pixel, seed_pixel) for pixel in frames.T)

Upvotes: 2
