Marziyeh Askari
Marziyeh Askari

Reputation: 265

RuntimeError:freeze_support() on Mac

I'm new on python. I want to learn how to parallel processing in python. I saw the following example:


import multiprocessing as mp

np.random.RandomState(100)
arr = np.random.randint(0, 10, size=[20, 5])
data = arr.tolist()

def howmany_within_range_rowonly(row, minimum=4, maximum=8):
    count = 0
    for n in row:
    if minimum <= n <= maximum:
        count = count + 1
    return count

pool = mp.Pool(mp.cpu_count())

results = pool.map(howmany_within_range_rowonly, [row for row in data])

pool.close()

print(results[:10])

but when I run it, this error happened:

RuntimeError: 
    An attempt has been made to start a new process before the
    current process has finished its bootstrapping phase.

    This probably means that you are not using fork to start your
    child processes and you have forgotten to use the proper idiom
    in the main module:

        if __name__ == '__main__':
            freeze_support()
            ...

    The "freeze_support()" line can be omitted if the program
    is not going to be frozen to produce an executable.

What should I do?

Upvotes: 15

Views: 11007

Answers (2)

Oliver K&#252;chler
Oliver K&#252;chler

Reputation: 346

I had a live example, where I faced the same RuntimeError issue when I executed a specific tool on MacOS-machines (on Linux machines it was fine though). However, I'm not sure about the exact cause for the problem, cause the if __name__ == "__main__" encapsulation seemed to be properly at place.

Following one comment on this Stack-Overflow entry, I suspected that using python>=3.8, which utilizes spawn as default method for calling subprocesses might be the problem.

My solution: Using python=3.7 did the trick.

Upvotes: 1

dspencer
dspencer

Reputation: 4471

If you place everything in global scope inside this if __name__ == "__main__" block as follows, you should find that your program behaves as you expect:

def howmany_within_range_rowonly(row, minimum=4, maximum=8):
    count = 0
    for n in row:
    if minimum <= n <= maximum:
        count = count + 1
    return count

if __name__ == "__main__":   
    np.random.RandomState(100)
    arr = np.random.randint(0, 10, size=[20, 5])
    data = arr.tolist()  
    pool = mp.Pool(mp.cpu_count())

    results = pool.map(howmany_within_range_rowonly, [row for row in data])

    pool.close()

    print(results[:10])

Without this protection, if your current module was imported from a different module, your multiprocessing code would be executed. This could occur within a non-main process spawned in another Pool and spawning processes from sub-processes is not allowed, hence we protect against this problem.

Upvotes: 19

Related Questions