zephyrus

Reputation: 1266

Multiprocessing large convolutions using SciPy: no speed-up

I am using scipy.signal.correlate to perform large 2D convolutions. I have a large number of arrays to operate on, so naturally I thought multiprocessing.Pool could help. However, the following simple setup (on a 4-core CPU) provides no benefit.

import multiprocessing as mp
import numpy as np
from scipy import signal

arrays = [np.ones([500, 500])] * 100
kernel = np.ones([30, 30])

def conv(array, kernel):
  return (array, kernel, signal.correlate(array, kernel, mode="valid", method="fft"))

pool = mp.Pool(processes=4)
results = [pool.apply(conv, args=(arr, kernel)) for arr in arrays]

Changing the process count among {1, 2, 3, 4} gives approximately the same runtime (2.6 s ± 0.2 s).

What could be going on?

Upvotes: 2

Views: 1097

Answers (2)

NoDataDumpNoContribution

Reputation: 10859

What could be going on?

At first I thought the reason was that the underlying FFT implementation is already parallelized: while the Python interpreter itself is single-threaded, the C code called from Python may be able to fully utilize your CPU.

However, the underlying FFT implementation used by scipy.correlate appears to be NumPy's fftpack (translated from Fortran written in 1985), which, judging from the fftpack page, is single-threaded.

Indeed, when I ran your script I saw considerably higher CPU usage. With four cores on my computer and four processes I got a speedup of roughly 2 (for 2000x2000 arrays, using the code changes from Raw Dawg).

The overhead of creating the processes and communicating with them eats up part of the benefit of the more efficient CPU usage.
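If you want to see how much of the time goes into process startup and pickling rather than into the FFTs themselves, a rough way is to time the serial loop against the pool version. Below is a minimal sketch, assuming the 2000x2000 test size mentioned above; the array count is illustrative only and the numbers will depend on your machine.

import time
import multiprocessing as mp
from functools import partial
import numpy as np
from scipy import signal

kernel = np.ones([30, 30])

def conv(array, kernel):
    # Same correlation as in the question.
    return signal.correlate(array, kernel, mode="valid", method="fft")

if __name__ == "__main__":
    # Larger arrays make the per-task FFT work dominate the pickling overhead.
    arrays = [np.ones([2000, 2000]) for _ in range(16)]

    t0 = time.perf_counter()
    serial = [conv(a, kernel) for a in arrays]
    t1 = time.perf_counter()

    with mp.Pool(processes=4) as pool:
        parallel = pool.map(partial(conv, kernel=kernel), arrays)
    t2 = time.perf_counter()

    print(f"serial:   {t1 - t0:.2f} s")
    print(f"pool.map: {t2 - t1:.2f} s")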

There are still a few things you can try: optimize the array sizes, or compute only a small part of the correlation if you don't need all of it. If the kernel is the same every time, you can compute the FFT of the kernel only once and reuse it (this requires implementing part of scipy.correlate yourself; a sketch follows below). You could also try single precision instead of double, or do the correlation on the graphics card with CUDA.
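Here is one way the "reuse the kernel FFT" idea could look. This is not something scipy.signal exposes directly; it is a hand-rolled sketch that reproduces mode="valid" correlation, assuming all arrays share the same shape and everything is real-valued.

import numpy as np
from scipy import fft

def make_correlator(kernel, array_shape):
    # Pad to the full linear-correlation size and compute the kernel FFT once.
    # Flipping the kernel turns the convolution below into a correlation.
    full_shape = [a + k - 1 for a, k in zip(array_shape, kernel.shape)]
    kernel_fft = fft.rfftn(kernel[::-1, ::-1], full_shape)

    def correlate_valid(array):
        out = fft.irfftn(fft.rfftn(array, full_shape) * kernel_fft, full_shape)
        # Crop to the "valid" region, matching signal.correlate(..., mode="valid").
        slices = tuple(slice(k - 1, a) for a, k in zip(array_shape, kernel.shape))
        return out[slices]

    return correlate_valid

# Build the correlator once, then apply it to every array (serially or in a pool).
corr = make_correlator(np.ones([30, 30]), (500, 500))
result = corr(np.ones([500, 500]))  # matches the original call up to FFT rounding

This saves one of the three FFTs per array; whether that is worth the extra code is something you would have to measure.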

Upvotes: 2

Sevy

Reputation: 698

I think the problem is in this line:

results = [pool.apply(conv, args=(arr, kernel)) for arr in arrays]

Pool.apply is a blocking operation: it runs conv on one element, waits for it to finish, and only then moves on to the next, so even though it looks like you are multiprocessing, nothing is actually being distributed. You need to use pool.map instead to get the behavior you are looking for:

from functools import partial

conv_partial = partial(conv, kernel=kernel)
results = pool.map(conv_partial, arrays)
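A small caveat worth adding: on platforms that start worker processes by spawning (Windows, and macOS on recent Python versions), the pool creation and the map call have to live under an if __name__ == "__main__": guard, and using the pool as a context manager closes it cleanly:

if __name__ == "__main__":
    conv_partial = partial(conv, kernel=kernel)
    with mp.Pool(processes=4) as pool:
        results = pool.map(conv_partial, arrays)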

Upvotes: 2
