Medo

Reputation: 1032

Cannot pass argument to method using multiprocessing.Pool

My program takes several command-line arguments, one of which, challenges, is an integer. I want to use multiprocessing and pass the value of challenges to a method I defined myself, generation:

import multiprocessing

gen = generator.GenParentClass()
mlp = multiprocessing.Pool(processes=multiprocessing.cpu_count())
X, y = mlp.imap_unordered(gen.generation, [args.challenges])

The method generation in class GenParentClass has this simple signature:

def generation(self, num):
   #some stuff

However, I get this error:

Traceback (most recent call last):
  File "experiments.py", line 194, in <module>
    X, y = mlp.imap_unordered(gen.generation, [args.challenges])
  File "/anaconda/lib/python2.7/multiprocessing/pool.py", line 668, in next
    raise value
cPickle.PicklingError: Can't pickle <type 'instancemethod'>: attribute lookup __builtin__.instancemethod failed

I don't know how to solve this problem. Everything seems correct to me! Any help is appreciated.

Upvotes: 0

Views: 1091

Answers (1)

Ken

Reputation: 463

The multiprocessing module serializes (pickles) both the function and the arguments passed to imap_unordered so that it can send them to the worker processes. Here gen.generation is a bound instance method (defined within a class), and Python 2's pickle cannot serialize instance methods, hence your error.
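A quick way to check whether a callable can be shipped to the workers is to try pickling it directly. The sketch below uses a hypothetical helper, can_pickle, which is not part of any library and is only meant as an illustration. Note that recent Python 3 versions can pickle bound methods of importable classes, so the exact error in the question is specific to Python 2.

```python
import pickle


def can_pickle(obj):
    """Return True if pickle can serialize obj (hypothetical helper)."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


# Functions that can be looked up by name in an importable module pickle fine.
print(can_pickle(len))

# A lambda has no importable name, so pickling it fails.
print(can_pickle(lambda x: x + 1))
```

On Python 3 this should report that len can be pickled and the lambda cannot; anything that fails this check will also fail when handed to a Pool.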

Edit: Here is a possible workaround that defines the function outside of the class and adds extra argument(s) to that function, which are filled in using partial from functools:

import multiprocessing
from functools import partial

class GenParentClass(object):
    a = None
    b = None
    def __init__(self, a, b):
        self.a = a
        self.b = b

# define this outside of GenParentClass (i.e., top level function)
def generation(num, x, y):
    return x+y+num

gen = GenParentClass(3, 5)
mlp = multiprocessing.Pool(processes=multiprocessing.cpu_count())
R = mlp.imap_unordered(partial(generation, x=gen.a, y=gen.b), [1,2,3])
# imap_unordered yields results in completion order, so sort before printing
print(sorted(R))   # prints "[9, 10, 11]"

More information on what can be pickled is available in the Python documentation for the pickle module.

More information on partial is available in the Python documentation for the functools module.
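If you would rather keep generation as a method, another option is a small top-level helper that receives the instance together with the argument and calls the method inside the worker. This is only a sketch: call_generation is a hypothetical name, and the body of generation is a stand-in because the real one is not shown in the question. It relies on the instance itself being picklable.

```python
import multiprocessing


class GenParentClass(object):
    def __init__(self, a, b):
        self.a = a
        self.b = b

    def generation(self, num):
        # Stand-in body (assumption); the real method is not shown in the question.
        return self.a + self.b + num


def call_generation(task):
    """Top-level helper (hypothetical): unpack (instance, num) and call the method."""
    instance, num = task
    return instance.generation(num)


def main():
    gen = GenParentClass(3, 5)
    tasks = [(gen, num) for num in [1, 2, 3]]
    pool = multiprocessing.Pool(processes=2)
    try:
        # Only the top-level function and the (instance, num) tuples are pickled.
        results = sorted(pool.imap_unordered(call_generation, tasks))
    finally:
        pool.close()
        pool.join()
    print(results)


if __name__ == "__main__":
    main()
```

The trade-off is that the whole instance is pickled once per task, so this suits small objects better than ones carrying large data.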

Edit 2: If you use multiprocessing.Process instead of Pool, and the method (called altgen below) refers to the instance attributes self.a and self.b, you can do this without rewriting the function outside the class. However, you won't be able to retrieve the return value, and the state of gen2 in the parent process will not change, because each child process works on its own copy (defeating the purpose of calling the function at all).

# assumes GenParentClass also defines a method altgen(self, key) that uses self.a and self.b
gen2 = GenParentClass(4, 6)
p = {}
for key in range(5):
    p[key] = multiprocessing.Process(target=GenParentClass.altgen, args=(gen2, key))
    p[key].start()

for key in p:
    p[key].join()
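If the return values are needed after all, one common pattern is to have each child put its result on a multiprocessing.Queue that the parent reads. The following is a minimal sketch under assumptions: worker is a hypothetical top-level helper, and altgen is given a stand-in body because the real one is not shown above.

```python
import multiprocessing


class GenParentClass(object):
    def __init__(self, a, b):
        self.a = a
        self.b = b

    def altgen(self, key):
        # Stand-in body (assumption); the real altgen is not shown in the answer.
        return self.a + self.b + key


def worker(instance, key, out_queue):
    """Run altgen in the child and send (key, result) back to the parent."""
    out_queue.put((key, instance.altgen(key)))


def main():
    gen2 = GenParentClass(4, 6)
    out_queue = multiprocessing.Queue()
    procs = [multiprocessing.Process(target=worker, args=(gen2, key, out_queue))
             for key in range(5)]
    for proc in procs:
        proc.start()
    # Read one result per process before joining, so no child blocks on a full pipe.
    results = dict(out_queue.get() for _ in procs)
    for proc in procs:
        proc.join()
    print(results)


if __name__ == "__main__":
    main()
```

This returns values to the parent, but gen2 in the parent is still unchanged; any state the children modify has to be sent back explicitly in the same way.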

Upvotes: 1
