Reputation: 1032
My program takes several arguments, one of which, challenges, receives an integer value from the command line. I want to use multiprocessing by passing the value of challenges to a self-defined method, generation:
import multiprocessing
gen = generator.GenParentClass()
mlp = multiprocessing.Pool(processes=multiprocessing.cpu_count())
X, y = mlp.imap_unordered(gen.generation, [args.challenges])
The method generation in class GenParentClass has this simple signature:
def generation(self, num):
    #some stuff
However, I get this error:
Traceback (most recent call last):
  File "experiments.py", line 194, in <module>
    X, y = mlp.imap_unordered(gen.generation, [args.challenges])
  File "/anaconda/lib/python2.7/multiprocessing/pool.py", line 668, in next
    raise value
cPickle.PicklingError: Can't pickle <type 'instancemethod'>: attribute lookup __builtin__.instancemethod failed
I don't know how to solve this problem. Everything seems correct to me! Any help is appreciated.
Upvotes: 0
Views: 1091
Reputation: 463
The multiprocessing module serializes (pickles) the function and the arguments passed to imap_unordered so that it can send them to the worker processes. gen.generation is an instance method (a function defined within a class and bound to an instance), and Python 2 cannot pickle instance methods, hence your error.
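One way to check this before handing a callable to the pool is to try pickling it yourself. The sketch below is only an illustration: can_pickle is a hypothetical helper name, not something from the question. Note that Python 3 can pickle bound methods of picklable instances, so to trigger a failure on a modern interpreter the sketch uses a lambda, which pickle rejects on both Python 2 and Python 3.

```python
import pickle

def can_pickle(obj):
    """Return True if pickle can serialize obj, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False

# A built-in or top-level function is pickled by reference to its name.
print(can_pickle(len))
# A lambda has no importable name, so pickling it fails.
print(can_pickle(lambda x: x))
```

Running the same check on gen.generation under Python 2.7 should report the failure seen in the traceback.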
Edit: Here is a possible workaround that defines the function outside of the class and adds extra argument(s) to that function, which are filled in using partial from functools:
import multiprocessing
from functools import partial

class GenParentClass(object):
    a = None
    b = None

    def __init__(self, a, b):
        self.a = a
        self.b = b

# define this outside of GenParentClass (i.e., top level function)
def generation(num, x, y):
    return x+y+num

gen = GenParentClass(3, 5)
mlp = multiprocessing.Pool(processes=multiprocessing.cpu_count())
R = mlp.imap_unordered(partial(generation, x=gen.a, y=gen.b), [1,2,3])
print([r for r in R]) # prints 9, 10 and 11, in whatever order the workers finish
More information on what can be pickled is available in the Python documentation for the pickle module.
More information on partial is available in the Python documentation for the functools module.
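If you would rather keep generation as a method, a variation on the same idea is to have the top-level function receive the instance and call the method on it. The sketch below assumes the instance itself is picklable. GenSketch, the body of its generation method, and call_generation are hypothetical stand-ins, not the original code.

```python
import multiprocessing
from functools import partial

class GenSketch(object):
    """Hypothetical stand-in for GenParentClass."""

    def __init__(self, a, b):
        self.a = a
        self.b = b

    def generation(self, num):
        # Placeholder body; the real method is not shown in the question.
        return self.a + self.b + num

def call_generation(gen, num):
    """Top-level wrapper: the pool pickles this plain function plus the instance."""
    return gen.generation(num)

if __name__ == "__main__":
    gen = GenSketch(3, 5)
    pool = multiprocessing.Pool(processes=2)
    try:
        results = pool.imap_unordered(partial(call_generation, gen), [1, 2, 3])
        # imap_unordered yields results as workers finish, so sort for a stable order.
        print(sorted(results))
    finally:
        pool.close()
        pool.join()
```

The `__main__` guard matters on platforms that start workers by re-importing the main module, such as Windows. Each worker operates on its own copy of the instance, so any attribute changes made inside generation stay in that worker.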
Edit 2: If you use multiprocessing.Process instead of a Pool, and the method (altgen in the code below) refers to its data through the qualified names self.a and self.b, you can do this without rewriting the function outside the class. However, you won't be able to retrieve the output, and the state of gen2 in the parent process will not change, because each child process works on its own copy (defeating the purpose of calling the function at all).
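# altgen is not shown above; it is assumed to be a method of GenParentClass
# that uses self.a and self.b
gen2 = GenParentClass(4, 6)
p = {}
for key in range(5):
    p[key] = multiprocessing.Process(target=GenParentClass.altgen, args=(gen2, key))
    p[key].start()
for key in p:
    p[key].join()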
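If you do need results back from separate Process objects, one common option is to pass a multiprocessing.Queue to each worker and have it put its result there. This is a minimal sketch under stated assumptions: the real altgen is not shown, so the version here simply returns a value, and GenSketch and worker are hypothetical names.

```python
import multiprocessing

class GenSketch(object):
    """Hypothetical stand-in for GenParentClass."""

    def __init__(self, a, b):
        self.a = a
        self.b = b

    def altgen(self, num):
        # Assumed behavior for illustration only.
        return self.a + self.b + num

def worker(gen, key, out_queue):
    """Run altgen in the child process and send (key, result) to the parent."""
    out_queue.put((key, gen.altgen(key)))

if __name__ == "__main__":
    gen2 = GenSketch(4, 6)
    out_queue = multiprocessing.Queue()
    procs = [multiprocessing.Process(target=worker, args=(gen2, key, out_queue))
             for key in range(5)]
    for proc in procs:
        proc.start()
    # Read one result per worker before joining, so no child blocks on a full queue.
    results = dict(out_queue.get() for _ in procs)
    for proc in procs:
        proc.join()
    print(sorted(results.items()))
```

This returns the computed values, but attribute changes made to gen2 inside a worker still do not propagate back to the parent.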
Upvotes: 1