Filibuster
Filibuster

Reputation: 81

Lock objects should only be shared between processes through inheritance

I am using the multiprocessing.Pool class within an object and attempting the following:

from multiprocessing import Lock, Pool

class A:
    def __init__(self):
        self.lock = Lock()
        self.file = open('test.txt')
    def function(self, i):
        self.lock.acquire()
        line = self.file.readline()
        self.lock.release()
        return line
    def anotherfunction(self):
        pool = Pool()
        results = pool.map(self.function, range(10000))
        pool.close()
        pool.join()
        return results

However I am getting a runtime error stating that lock objects should only be shared between processes through inheritance. I am fairly new to Python and multiprocessing. How can I get put on the right track?

Upvotes: 8

Views: 17700

Answers (1)

Booboo
Booboo

Reputation: 44313

multiprocessing.Lock instances can be attributes of multiprocessing.Process instances. When a process is created in the main process with a lock attribute, the lock exists in the main process’s address space. When the process’s start method is invoked and runs a subprocess which invokes the process’s run method, the lock has to be serialized/deserialized to the subprocess address space. This works as expected:

from multiprocessing import Lock, Process

class P(Process):
    def __init__(self, *args, **kwargs):
        Process.__init__(self, *args, **kwargs)
        self.lock = Lock()
    def run(self):
        print(self.lock)

if __name__ == '__main__':
    p = P()
    p.start()
    p.join()

Prints:

<Lock(owner=None)>

Unfortuantely, this does not work when you are dealing with multiprocessing.Pool instances. In your example, self.lock is created in the main process by the __init__ method. But when Pool.map is called to invoke self.function, the lock cannot be serialized/deserialized to the already-running pool process that will be running this method.

The solution is to initialize each pool process with a global variable set to this lock (there is no point in having this lock being an attribute of the class now). The way to do this is to use the initializer and initargs parameters of the pool __init__ method. See the documentation:

from multiprocessing import Lock, Pool

def init_pool_processes(the_lock):
    '''Initialize each process with a global variable lock.
    '''
    global lock
    lock = the_lock

class Test:
    def function(self, i):
        lock.acquire()
        with open('test.txt', 'a') as f:
            print(i, file=f)
        lock.release()
    def anotherfunction(self):
        lock = Lock()
        pool = Pool(initializer=init_pool_processes, initargs=(lock,))
        pool.map(self.function, range(10))
        pool.close()
        pool.join()

if __name__ == '__main__':
    t = Test()
    t.anotherfunction()

Upvotes: 16

Related Questions