pingboing

Reputation: 69

Why does multiprocessing.Process not work here?

I am testing multiprocessing in Jupyter Notebook and Spyder:

import multiprocessing
import time

start = time.perf_counter()

def do_something():
    print(f'Sleeping 5 second(s)...')
    time.sleep(5)
    print(f'Done Sleeping...') 


p2 = multiprocessing.Process(target = do_something)
p3 = multiprocessing.Process(target = do_something)

p2.start()
p3.start()

p2.join()
p3.join()

finish = time.perf_counter()

print(f'Finished in {round(finish-start, 2)} seconds')

And I got:

Finished in 0.12 seconds

This is much shorter than 5 seconds. I tested the do_something function on its own and it seems fine. It looks as if, in the code above, do_something was never executed at all...

start = time.perf_counter()

def do_something(seconds):
    print(f'Sleeping {seconds} second(s)...')
    time.sleep(seconds)
    print(f'Done Sleeping...{seconds}') 

do_something(5)

finish = time.perf_counter()

print(f'Finished in {round(finish-start, 2)} seconds')

Output:

Sleeping 5 second(s)...
Done Sleeping...5
Finished in 5.0 seconds

Upvotes: 0

Views: 1634

Answers (1)

feddg

Reputation: 36

Your code should raise an error (I am omitting the full traceback to keep the answer short):

RuntimeError: 
    An attempt has been made to start a new process before the
    current process has finished its bootstrapping phase.

    This probably means that you are not using fork to start your
    child processes and you have forgotten to use the proper idiom
    in the main module:

        if __name__ == '__main__':
            freeze_support()
            ...

    The "freeze_support()" line can be omitted if the program
    is not going to be frozen to produce an executable.

Long story short: when child processes are not started with fork (for example with the spawn start method, which is the default on Windows), each new process imports your main module again, so any top-level code that starts processes runs again inside the child. You should keep the definitions at the beginning of the file and put the code you want to execute inside this guard:

if __name__ == '__main__':

Otherwise, each new process will try to execute the same file (and spawn other processes as well). The corrected code takes about 5.22 seconds to complete on my PC.

The need for the "if" is explained in the programming guidelines (section "Safe importing of main module") of the multiprocessing package. Be sure to read them to avoid unwanted behaviour: multithreading and multiprocessing are prone to elusive bugs when not used correctly.
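As a rough sketch of when the guard matters, you can check which start method your platform uses. The helper name needs_main_guard is made up for this illustration and is not part of multiprocessing; only multiprocessing.get_start_method() is a real library call.

```python
import multiprocessing


def needs_main_guard(start_method):
    """Return True if children import the main module again under this start method."""
    # Under 'spawn' and 'forkserver' the child imports the main module again;
    # under 'fork' it inherits the parent's memory instead.
    return start_method in ('spawn', 'forkserver')


if __name__ == '__main__':
    method = multiprocessing.get_start_method()
    print(f'Start method: {method}, guard required: {needs_main_guard(method)}')
```

Writing the guard is harmless under fork too, so it is simplest to always include it.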

Here is the corrected code:

import multiprocessing
import time

def do_something():
    print('Sleeping 5 seconds...')
    time.sleep(5)
    print('Done Sleeping.') 

if __name__ == '__main__':
    start = time.perf_counter()

    p2 = multiprocessing.Process(target=do_something, args=())
    p3 = multiprocessing.Process(target=do_something, args=())

    p2.start()
    p3.start()

    p2.join()
    p3.join()

    finish = time.perf_counter()

    print(f'Finished in {round(finish-start, 2)} seconds')

Why do you see the output after 0.12 seconds? Each child process raises its error and crashes (you should get two identical runtime errors), so both join() calls return almost immediately and the parent process is able to complete.
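One way to notice this kind of silent crash is to look at each child's exitcode after join(). The following is only a minimal sketch: describe_exitcode is a hypothetical helper written for this answer, while Process.exitcode is the real attribute (None while the child has not terminated, 0 on success, a negative value -N if the child was killed by signal N).

```python
import multiprocessing
import time


def do_something():
    time.sleep(1)


def describe_exitcode(exitcode):
    """Turn Process.exitcode into a short human-readable status."""
    if exitcode is None:
        return 'not finished (or not started)'
    if exitcode == 0:
        return 'finished normally'
    if exitcode < 0:
        return f'killed by signal {-exitcode}'
    return f'exited with error code {exitcode}'


if __name__ == '__main__':
    procs = [multiprocessing.Process(target=do_something) for _ in range(2)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    # A non-zero exit code here means the child did not run to completion.
    for p in procs:
        print(p.name, describe_exitcode(p.exitcode))
```

If a child crashed with an unhandled exception, its exit code is non-zero, which explains a run that finishes much faster than expected.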

Upvotes: 2
