ptrck
ptrck

Reputation: 79

Multiprocessing a member function

I have a list of objects and need to call a member function of every object. Is it possible to use multiprocessing for that?

I wrote a short example of what i want to do.

import multiprocessing as mp

class Example:
    data1 = 0
    data2 = 3

    def compute():
        self.val3 = 6


listofobjects = []

for i in range(5):
    listofobjects.append(Example())

pool = mp.Pool()
pool.map(listofobjects[range(5)].compute())

Upvotes: 1

Views: 3033

Answers (1)

Mad Physicist
Mad Physicist

Reputation: 114518

There are two conceptual issues that @abarnert has pointed out, beyond the syntactic and usage problems in your "pseudocode". The first is that map works with a function that is applied to the elements of your input. The second is that each sub-process gets a copy of you object, so changes to attributes are not automatically seen in the originals. Both issues can be worked around.

To answer your immediate question, here is how you would apply a method to your list:

with mp.Pool() as pool:
    pool.map(Example.compute, listofobjects)

Example.compute is an unbound method. That means that it is just a regular function that accepts self as a first argument, making it a perfect fit for map. I also use the pool as a context manager, which is recommended to ensure that cleanup is done properly whether or not an error occurs.

The code above would not work because the effects of compute would be local to the subprocess. The only way to pass them back to the original process would be to return them from the function you passed to map. If you don't want to modify compute, you could do something like this:

def get_val3(x):
    x.compute()
    return x.val3

with mp.Pool() as pool:
    for value, obj in zip(pool.map(get_val3, listofobjects), listofobjects):
        obj.val3 = value

If you were willing to modify compute to return the object it is operating on (self), you could use it to replace the original objects much more efficiently:

class Example:
    ...
    def compute():
        ...
        return self

with mp.Pool() as pool:
    listofobjects = list(pool.map(Example.compute, listofobjects))

Update

If your object or some part of its reference tree does not support pickling (which is the form of serialization normally used to pass objects between processes), you can at least get rid of the wrapper function by returning the updated value directly from compute:

class Example:
    ...
    def compute():
        self.val3 = ...
        return self.val3

with mp.Pool() as pool:
    for value, obj in zip(pool.map(Example.compute, listofobjects), listofobjects):
        obj.val3 = value

Upvotes: 3

Related Questions