Reputation: 235
I have a multiprocessing Lock, that I define as
import multiprocessing
lock1 = mp.Lock()
To share this lock among the different child processes I do:
def setup_process(lock1):
global lock_1
lock_1 = lock1
pool = mp.Pool(os.cpu_count() - 1,
initializer=setup_process,
initargs=[lock1])
Now I've noticed that if the processes call the following function, and the function is defined in the same python module (i.e., same file):
def test_func():
print("lock_1:", lock_1)
with lock_1:
print(str(mp.current_process()) + " has the lock in test function.")
I get an output like:
lock_1 <Lock(owner=None)>
<ForkProcess name='ForkPoolWorker-1' parent=82414 started daemon> has the lock in test function.
lock_1 <Lock(owner=None)>
<ForkProcess name='ForkPoolWorker-2' parent=82414 started daemon> has the lock in test function.
lock_1 <Lock(owner=None)>
<ForkProcess name='ForkPoolWorker-3' parent=82414 started daemon> has the lock in test function.
However, if test_function is defined in a different file, the Lock is not recognized, and I get:
NameError:
name 'lock_1' is not defined
This seems to happen for every function, where the important distinction is whether the function is defined in this module or in another one. I'm sure I'm missing something very obvious with the global variables, but I'm new to this and I haven't been able to figure it out. How can I make the Locks be recognized everywhere?
Upvotes: 1
Views: 221
Reputation: 11075
Well, I learned something new about python today: global
isn't actually truly global. It only is accessible at the module scope.
There are a multitude of ways of sharing your lock with the module in order to allow it to be used, and the docs even suggest a "canonical" way of sharing globals between modules (though I don't feel it's the most appropriate for this situation). To me this situation illustrates one of the short fallings of using globals in the first place, though I have to admit in the specific case of multiprocessing.Pool
initializers it seems to be the accepted or even intended use case to use globals to pass data to worker functions. It actually makes sense that globals can't cross module boundaries because that would make the separate module 100% dependent on being executed by a specific script, so it can't really be considered a separate independent library. Instead it could just be included in the same file. I recognize that may be at odds with splitting things up not to create re-usable libraries but simply just to organize code logically in shorter to read segments, but that's apparently a stylistic choice by the designers of python.
To solve your problem, at the end of the day, you are going to have to pass the lock to the other module as an argument, so you might as well make test_func
recieve lock_1
as an argument. You may have found however that this will cause a RuntimeError: Lock objects should only be shared between processes through inheritance
message, so what to do? Basically, I would keep your initializer, and put test_func
in another function which is in the __main__
scope (and therefore has access to your global lock_1
) which grabs the lock, and then passes it to the function. Unfortunately we can't use a nicer looking decorator or a wrapper function, because those return a function which only exists in a local scope, and can't be imported when using "spawn" as the start method.
from multiprocessing import Pool, Lock, current_process
def init(l):
global lock_1
lock_1 = l
def local_test_func(shared_lock):
with shared_lock:
print(f"{current_process()} has the lock in local_test_func")
def local_wrapper():
global lock_1
local_test_func(lock_1)
from mymodule import module_test_func #same as local_test_func basically...
def module_wrapper():
global lock_1
module_test_func(lock_1)
if __name__ == "__main__":
l = Lock()
with Pool(initializer=init, initargs=(l,)) as p:
p.apply(local_wrapper)
p.apply(module_wrapper)
Upvotes: 1