Rain

Reputation: 375

Python for-loop parallelization using multiprocessing.Pool

I have a piece of code that looks like this:

def calc_stuff(x,a,b,c):
    ...
    return y
x = range(N)
y = zeros(x.shape)
if __name__ == '__main__':
    p = Pool(nprocs)
    y = p.map(calc_stuff,x,a,b,c)

This does not work. From what I found online, the reason is that Pool.map takes a single iterable rather than a list of separate arguments. What is the simplest way to modify this code so that it runs in parallel, where x is the array/iterable I want to parallelize over?

Thank you.

Upvotes: 1

Views: 1431

Answers (2)

Blckknght

Reputation: 104712

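One option is to use itertools.repeat with zip to build your multiple arguments into an iterable of tuples, and then use multiprocessing.Pool.starmap (available since Python 3.3) to call the function with each tuple unpacked as its arguments. Note that itertools.izip exists only in Python 2, where Pool.starmap is not available, so plain zip is the one to use here: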

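from itertools import repeat
from multiprocessing import Pool

if __name__ == '__main__':
    p = Pool(nprocs)
    # Each tuple (x_i, a, b, c) is unpacked into one call calc_stuff(x_i, a, b, c).
    y = p.starmap(calc_stuff, zip(x, repeat(a), repeat(b), repeat(c)))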
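For reference, a self-contained sketch of the same idea. The body of calc_stuff is a stand-in (the question elides the real one), build_args is a hypothetical helper name, and the values for a, b, c and the pool size are arbitrary assumptions:

```python
from itertools import repeat
from multiprocessing import Pool


def calc_stuff(x, a, b, c):
    # Stand-in body for illustration; the question elides the real one.
    return x + a + b + c


def build_args(x, a, b, c):
    # Hypothetical helper: pair every element of x with the constants.
    # repeat() is infinite, so zip stops when x is exhausted.
    return zip(x, repeat(a), repeat(b), repeat(c))


if __name__ == '__main__':
    # Pool size 4 is an arbitrary choice for this sketch.
    with Pool(4) as p:
        y = p.starmap(calc_stuff, build_args(range(10), 1, 2, 3))
    print(y)  # [6, 7, 8, 9, 10, 11, 12, 13, 14, 15]
```

Keeping calc_stuff at module level matters, since the worker processes need to be able to import it.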

Upvotes: 2

user707650

Reputation:

Have a look at functools.partial, which binds the arguments you don't want to iterate over and returns a new function that takes only the remaining one.

from multiprocessing import Pool
import functools

def calc_stuff(a,b,c, x):
    return x+a+b+c

N = 10
x = list(range(N))
a = 1
b = 2
c = 3
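if __name__ == '__main__':
    nprocs = 4
    p = Pool(nprocs)
    # Bind a, b and c; the resulting function takes only x.
    calc_stuff_p = functools.partial(calc_stuff, a, b, c)
    y = p.map(calc_stuff_p, x)
    # Keep the print inside the guard: y is not defined in worker processes.
    print(y)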

results in

[6, 7, 8, 9, 10, 11, 12, 13, 14, 15]

Note that your iterable x now comes last in the signature of calc_stuff: partial fills in the leading positional arguments (a, b and c), and any argument passed at call time is appended after them.
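If you would rather keep x as the first parameter, as in the question, one possible variant is to bind a, b and c by keyword instead of by position. This is only a sketch: the body of calc_stuff is a stand-in, and the values and pool size are arbitrary.

```python
from functools import partial
from multiprocessing import Pool


def calc_stuff(x, a, b, c):
    # Stand-in body for illustration; x stays the first parameter.
    return x + a + b + c


# Binding by keyword leaves the first positional slot free for x.
calc_stuff_p = partial(calc_stuff, a=1, b=2, c=3)

if __name__ == '__main__':
    # Pool size 4 is an arbitrary choice for this sketch.
    with Pool(4) as p:
        y = p.map(calc_stuff_p, range(10))
    print(y)  # [6, 7, 8, 9, 10, 11, 12, 13, 14, 15]
```

The partial object pickles cleanly here because calc_stuff is defined at module level.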

Upvotes: 0
