Reputation: 138
I am parallelizing my simulation with Multiprocessing.pool, however i cannot pass a type 'module' in the pool.map which gives pickling error, i need to use that argument in the target function of pool.map to parallelize my code. In the function sample(), the fourth argument 'sim' is of type 'Module', so i cannot pass it with p.map(), since it cannot be iterated, but i need that argument in the function parallel(), which should be used as
model=sim.simulate(modelname, packname, config)
But currently i am importing that module statically and calling in the function parallel() as
model=OpenModelica.simulate(modelname, packname, config)
Currently my code looks like this, is there way to declare the argument 'sim' in function sample() as global and access it in the target function parallel().
def sample(file,model,config,sim,resultDir,deleteDir):
from multiprocessing import Pool
p=Pool()
p.map(parallel,zip(file,model,dirs,resultpath,config))
def parallel(modellists):
packname=[]
packname.append(modellists[0])
modelname=modellists[1]
dirname=modellists[2]
path=modellists[3]
config=modellists[4]
os.chdir(dirname)
model=OpenModelica.Model(modelname, packname, config)
Upvotes: 2
Views: 1294
Reputation: 35247
This is because pickle
is unable to serialize a module, and thus multiprocessing
can't pass the module through the map
. However, if you use a fork of multiprocessing
called pathos.multiprocessing
, it works. This is because pathos
uses the dill
serializer, which can pickle modules.
>>> import dill
>>> import numpy
>>>
>>> from pathos.multiprocessing import ProcessingPool as Pool
>>> p = Pool()
>>>
>>> def getname(x):
... return getattr(x, '__package__', None)
...
>>> p.map(getname, [dill, numpy])
['dill', 'numpy']
It also works for multiple arguments, so it's a bit more natural than having to zip all the arguments -- and has asynchronous and iterative maps as well.
>>> packname = list('abcde')
>>> modelname = list('ABCDE')
>>> dirname = list('12345')
>>> config = [1,2,3,4,5]
>>> import math
>>> f = [math.sin, math.cos, math.sqrt, math.log, math.tan]
>>>
>>> def parallel(s1, s2, si, i, f):
... s = (s1 + s2).lower().count('b')
... return f(int(si) + i - s)
...
>>> res = p.amap(parallel, packname, modelname, dirname, config, f)
>>> print "asynchronous!"
'asynchronous!'
>>> res.get()
[0.9092974268256817, -0.4161468365471424, 2.449489742783178, 2.0794415416798357, 0.6483608274590867]
Get pathos
here: https://github.com/uqfoundation
Upvotes: 2