berberto

Reputation: 101

ProcessPoolExecutor does not start

I am working in a Jupyter notebook. I'm new to multiprocessing in Python, and I'm trying to parallelize the calculation of a function over a grid of parameters. Here is a snippet of code that is quite representative of what I'm doing:

import os
import numpy as np
from concurrent.futures import ProcessPoolExecutor

def f(x,y):
    print(os.getpid(), x,y,x+y)
    return x+y

xs = np.linspace(5,7,3).astype(int)
ys = np.linspace(1,3,3).astype(int)

func = lambda p: f(*p)
with ProcessPoolExecutor() as executor:
    args = (arg for arg in zip(xs,ys))
    results = executor.map(func, args)
    
for res in results:
    print(res)

The executor doesn't even start.

There is no problem whatsoever if I execute the same thing serially, e.g. with a list comprehension:

args = (arg for arg in zip(xs,ys))
results = [func(arg) for arg in args]

Upvotes: 1

Views: 2089

Answers (1)

brensnap

Reputation: 139

Are you running on Windows? I think your main problem is that each worker process tries to re-execute your whole script, so you should guard the executor setup with an if __name__ == "__main__" check. I think you have a second issue with the lambda function: it can't be pickled, and the processes communicate by pickling data. There are work-arounds for that, but in this case it looks like you don't really need the lambda, since executor.map accepts multiple iterables. Try something like this:

import os
import numpy as np
from concurrent.futures import ProcessPoolExecutor


def f(x, y):
    print(os.getpid(), x, y, x + y)
    return x + y

if __name__ == '__main__':

    xs = np.linspace(5, 7, 3).astype(int)
    ys = np.linspace(1, 3, 3).astype(int)

    with ProcessPoolExecutor() as executor:
        results = executor.map(f, xs, ys)

    for res in results:
        print(res)
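
If you really did need the argument-unpacking wrapper, the usual work-around is to replace the lambda with a module-level function, which can be pickled by reference. A minimal sketch (the name call_f is just for illustration):

import os
from concurrent.futures import ProcessPoolExecutor


def f(x, y):
    print(os.getpid(), x, y, x + y)
    return x + y


def call_f(p):
    # A named, module-level function is picklable, unlike a lambda.
    return f(*p)


if __name__ == '__main__':
    args = list(zip([5, 6, 7], [1, 2, 3]))

    with ProcessPoolExecutor() as executor:
        results = executor.map(call_f, args)

    for res in results:
        print(res)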

Upvotes: 2
