thedavincicoder

Reputation: 17

Python Multiprocessing Looping Python File Instead of Starting Process

I'm trying to get started with multiprocessing, and I'm running into some interesting issues. The code I'm using is below (for the record, this example is straight from the multiprocessing documentation):

from multiprocessing import Process

def f(name):
     print('hello', name)

if __name__ == '__main__':

     p = Process(target=f, args=('bob'))
     p.start()
     p.join()

This works fine and prints "hello bob" as it should. When I add any other code to the file, though, whether before or after the if statement, p never seems to run, and the file loops back to the beginning and runs all over again, endlessly. For example, the following code gives me this issue:

from multiprocessing import Process

def f(name):
     print('hello', name)

if __name__ == '__main__':

     p = Process(target=f, args=('bob'))
     p.start()
     p.join()

test_input = input("test input")

I am running Python 3.10.0 on Windows 10 with PyCharm 2021.3.2. Is this an issue that any of you have seen before? At this point I'm starting to wonder if it's an issue between Windows and PyCharm or Windows and Python, or maybe just a case of inexperience on my part.

Thank you!

Upvotes: 0

Views: 591

Answers (3)

ShadowRanger

Reputation: 155654

That if __name__ == '__main__': guard is important. On systems that don't use fork (Windows among them), multiprocessing simulates a fork by importing the main script in each worker process without naming it __main__ (it's named __mp_main__ IIRC). Any code that should only run in the "main" script needs to be protected by that guard. The protection can be indirect: define a function and call it within the guarded segment; the function will be defined in the workers, but not run.
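As a rough illustration of the guard, the sketch below mimics (it does not reproduce) what a worker does on such systems: it runs a tiny script twice with the standard runpy module, once under the name __main__ and once under __mp_main__, the name mentioned above. The script text, the events list, and the run_as helper are all made up for this example.

```python
import os
import runpy
import tempfile

# A stand-in for "your script": one unguarded statement, one guarded statement.
SCRIPT = '''
events = []
events.append("top-level code ran")
if __name__ == "__main__":
    events.append("guarded code ran")
'''

def run_as(run_name):
    """Execute SCRIPT as if it were a module called run_name; return what ran."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo_script.py")
        with open(path, "w") as fh:
            fh.write(SCRIPT)
        # run_path executes the file and returns its global namespace.
        namespace = runpy.run_path(path, run_name=run_name)
    return namespace["events"]

print(run_as("__main__"))     # both statements run
print(run_as("__mp_main__"))  # only the unguarded statement runs
```

The unguarded statement runs both times, which is why a stray input() call at module level also runs in every worker.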

So to fix this, all you need to do is indent the test_input = input("test input") so it's protected by the if __name__ == '__main__': guard. In real code, I try to keep the guarded section clean (so I can't accidentally write functions that rely on global state that doesn't exist when it's not run as the main script, and for the mild performance benefits of using function locals over globals), so I'd write it like:

from multiprocessing import Process

def f(name):
    print('hello', name)

def main():
    p = Process(target=f, args=('bob',))
    p.start()
    p.join()

    test_input = input("test input")

if __name__ == '__main__':
    main()

but that's not strictly necessary.

Upvotes: 1

Booboo

Reputation: 44293

I thought I would elaborate on ShadowRanger's answer:

On Windows systems new subprocesses are created by the following steps:

  1. A new process is created wherein the Python interpreter is re-launched.
  2. The Python interpreter re-interprets the current source program, executing everything at global scope in order to compile function definitions, initialize global variables, etc.
  3. Finally, your worker function, f in this case, is invoked with memory thus initialized.

The reason for placing the code that creates the subprocess within a block governed by if __name__ == '__main__': is that if you didn't, then because of Step 2 above you would get into a recursive, infinite loop creating new subprocesses ad infinitum. The key point is that only in the main process will the variable __name__ have the value '__main__'; it will have a different value in any subprocess that is created. And so the code that creates the new subprocess, i.e. p = Process(target=f, args=('bob',)), will not be executed as part of the initialization of the subprocess.

Your problem arises from the statement test_input = input("test input") being at global scope rather than within an if __name__ == '__main__': block, so it is executed as part of the initialization of the subprocess. Your worker function, f, will therefore not run until this prompt for input is satisfied, and when f returns, your main process will put out the prompt again. At least, this is what I see when the program is run from a Windows command prompt. Perhaps PyCharm restricts calling input from any process other than the main process. But even if that statement throws an exception while the subprocess is being created, I still don't quite see how your program would loop continuously. Unfortunately, I do not have PyCharm installed.

Upvotes: 1

Furkan Ozalp

Reputation: 334

Regarding ShadowRanger's answer, I think you should also put a comma after 'bob', as in the example at https://docs.python.org/3/library/multiprocessing.html

p should look like this if you want to add another statement:

  p = Process(target=f, args=('bob',))
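To see why the comma matters, here is a small sketch that leaves multiprocessing out entirely. It assumes Process ends up calling the target as target(*args); call_like_process is a hypothetical stand-in written for this example, not part of the library, and f returns its greeting instead of printing it so the result is easy to inspect.

```python
def f(name):
    return f"hello {name}"

def call_like_process(target, args):
    # Hypothetical stand-in: unpack args into positional arguments,
    # the way a Process is assumed to call its target.
    return target(*tuple(args))

print(type(('bob')).__name__)   # str: parentheses alone do not make a tuple
print(type(('bob',)).__name__)  # tuple: the trailing comma does

print(call_like_process(f, ('bob',)))  # hello bob

try:
    call_like_process(f, ('bob'))      # unpacks to f('b', 'o', 'b')
except TypeError as exc:
    print("TypeError:", exc)
```

Without the comma, the three characters of 'bob' are passed as three separate arguments, and f accepts only one.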

Upvotes: 0
