Reputation: 197
This question is related to a previous question I asked. It seems like a simple problem, but I have a hard time finding useful information or tutorials about multiprocessing.
My problem is that I would like to combine the data produced by the worker processes into one big array and then store it in my HDF5 file.
import multiprocessing as mp
import numpy as np
import tables as pt

def Simulation(i, output):
    # Run a simulation which outputs its results in A, with shape (4000, 3).
    A = np.zeros((4000, 3))  # placeholder for the real simulation result
    output.put(A)

def handle_output(output):
    hdf = pt.openFile('simulation.h5', mode='w')
    hdf.createGroup('/', 'data')
    # Here the outputs need to be joined somehow;
    # I would like to get them in the shape (4000, 3, 10).
    results = []
    while True:
        A = output.get()
        if A is None:  # sentinel sent by the main process
            break
        results.append(A)
    hdf.createArray('/data', 'array', np.dstack(results))
    hdf.close()

if __name__ == '__main__':
    output = mp.Queue()
    jobs = []
    proc = mp.Process(target=handle_output, args=(output,))
    proc.start()
    for i in range(10):
        p = mp.Process(target=Simulation, args=(i, output))
        jobs.append(p)
        p.start()
    for p in jobs:
        p.join()
    output.put(None)  # tell the writer process no more results are coming
    proc.join()
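The consumer side of this pattern, reading from the queue until the `None` sentinel arrives and then joining the collected arrays, can be sketched in isolation. The `drain` helper and the toy shapes below are illustrative, not part of the original code:

```python
import multiprocessing as mp
import numpy as np

def drain(queue):
    """Collect items from the queue until the None sentinel is seen."""
    chunks = []
    while True:
        item = queue.get()
        if item is None:
            break
        chunks.append(item)
    return chunks

q = mp.Queue()
for i in range(3):
    q.put(np.full((4, 3), float(i)))  # toy stand-ins for simulation results
q.put(None)  # sentinel: no more results

parts = drain(q)
joined = np.dstack(parts)  # stacks along a new third axis -> shape (4, 3, 3)
```

The same `drain`/`np.dstack` combination would give the (4000, 3, 10) array in the writer process above.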
Upvotes: 8
Views: 15162
Reputation: 8548
What you really need is a multiprocessing Pool.
Just do something like this:
import multiprocessing as mp
import numpy as np

def Simulation(i):
    # Run the simulation and return its result array of shape (4000, 3).
    return np.zeros((4000, 3))  # placeholder for the real simulation

if __name__ == '__main__':
    p = mp.Pool(16)
    result = p.map(Simulation, range(10))
    # result is a list of ten (4000, 3) arrays; np.array stacks them
    # into shape (10, 4000, 3), which can then be reshaped as needed.
    result = np.array(result).reshape(...)
    p.close()
    p.join()
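To get the (4000, 3, 10) layout the question asks for, the stacked `Pool` output can be transposed rather than reshaped. A minimal sketch with placeholder data (the arrays here are illustrative stand-ins for real simulation results):

```python
import numpy as np

# Hypothetical stand-ins for ten simulation results of shape (4000, 3).
results = [np.full((4000, 3), float(i)) for i in range(10)]

stacked = np.array(results)                  # shape (10, 4000, 3)
combined = np.transpose(stacked, (1, 2, 0))  # shape (4000, 3, 10)

# np.dstack gives the same layout in one step.
same = np.dstack(results)
```

Note that `reshape` alone would not produce this layout, because it keeps the elements in their original memory order; moving the run index to the last axis requires a transpose (or `np.dstack`).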
Upvotes: 12