Felix P.

Reputation: 73

Cannot use result from Multiprocess Pool directly

I have the following example code:

def my_function_caller():
    samples = []
    for t in range(2):
        samples.append(my_function(t))
    return samples

def my_function(t):
    results = []
    if __name__ == '__main__':
        pool = Pool()
        results = pool.map(task, range(5))
        pool.close()
        pool.join()
    A = results[0]
    return A


def task(k):
    time.sleep(1)
    result = k
    return result

When I call my_function(t), I get the following error:

    A = results[0]
IndexError: list index out of range

I expected pool.close() and pool.join() to make the program wait for all processes to finish, so that I could use the combined result "results" afterwards. How can I force the program to wait, or more generally, how can I use "results" directly inside the function "my_function"?

EDIT: To recreate the error: This is the complete code that I am running (simply copied and pasted). The python file called main.py is located in a standard Python project and I am using Windows.

from multiprocessing import Pool
import time

def my_function_caller():
    samples = []
    for t in range(2):
        samples.append(my_function(t))
    return samples

def my_function(t):
    results = []
    if __name__ == '__main__':
        pool = Pool()
        results = pool.map(task, range(5))
        pool.close()
        pool.join()
    A = results[0]
    return A


def task(k):
    time.sleep(1)
    result = k
    return result

a = my_function_caller()

As additional information, I get the error message

        A = results[0]
IndexError: list index out of range

several times, not just once.

Upvotes: 1

Views: 318

Answers (2)

swaggg

Reputation: 480

This isn't really my own answer, but I'm going to post it as one anyway. Multiprocessing behaves differently on Windows than you might expect, as mentioned here:

python multiprocessing on Windows

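Each worker process is supposed to only call your function, but on Windows the worker re-imports the main module first, so everything at module level runs all over again in every worker. In the worker, __name__ is "__mp_main__" rather than "__main__", so the guarded block inside my_function is skipped, results stays an empty list, and results[0] raises the IndexError (once per worker, which is why you see it several times).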

You have to guard the entry point with if __name__ == "__main__":

if __name__ == "__main__":
    a = my_function_caller()

Separately, you should still use if __name__ == "__main__" or __name__ == "__mp_main__": in the function that creates the pool (it uses processes, not threads), but put the check at the top of the function, or at least make sure the code won't try to access a value that was never computed when the module is merely imported.
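
If you want to see the re-execution for yourself, a small diagnostic script helps. This is only a sketch, and run_pool is a name I made up for illustration. The print at module level runs every time the file is loaded. On Windows, where workers are started with the "spawn" method, I would expect it to show up once for the parent and again in each worker with a different __name__; with "fork" on Linux the workers do not re-import the file.

```python
from multiprocessing import Pool
import os

# Module-level code: runs on every load of this file, including any
# re-import that a spawned worker process performs.
print(f"module loaded: __name__={__name__!r}, pid={os.getpid()}")


def task(k):
    # Trivial worker function, same shape as in the question.
    return k


def run_pool():
    # Hypothetical helper: creates a small pool and collects the results.
    with Pool(2) as pool:
        return pool.map(task, range(5))


if __name__ == "__main__":
    # Only the process that was started as the script reaches this point.
    print(run_pool())
```

Anything you do not want repeated in the workers (like a = my_function_caller()) belongs under that final guard.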

Upvotes: 2

Roman Pavelka

Reputation: 4171

It worked for me on Linux. However, I find the structure a little bit messy. Consider something like this instead, which should make your problem easier to debug:

from multiprocessing import Pool
import time


def my_function_caller():
    samples = []
    for t in range(2):
        samples.append(my_function(t))
    return samples


def my_function(t):
    with Pool(5) as p:
        results = p.map(task, range(5))
    A = results[0]
    return A


def task(k):
    time.sleep(1)
    result = k
    return result


if __name__ == "__main__":
    a = my_function_caller()
    print(a)
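
On the "how do I force the program to wait" part: Pool.map already blocks until every result is back, which is why the version above needs no explicit close()/join() to get a complete list. A minimal sketch to check this, assuming a shorter sleep than in the question; timed_map is a made-up helper name:

```python
from multiprocessing import Pool
import time


def task(k):
    # Shorter sleep than in the question, just to keep the check quick.
    time.sleep(0.2)
    return k


def timed_map():
    # Hypothetical helper: measures how long the blocking map call takes.
    start = time.perf_counter()
    with Pool(5) as pool:
        # map returns only after all five tasks have finished,
        # with the results in the same order as the inputs.
        results = pool.map(task, range(5))
    elapsed = time.perf_counter() - start
    return results, elapsed


if __name__ == "__main__":
    results, elapsed = timed_map()
    print(results, f"{elapsed:.2f}s")
```

The list is fully populated as soon as map returns, so results[0] is safe to use right after the with block.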

Upvotes: 2
