Reputation: 1
I am new to Python, and I want to use pool.apply_async() to calibrate my code. The parameters of pool.apply_async() confuse me.
Here is my code:
import multiprocessing
from matplotlib.backends.backend_pdf import PdfPages

def detect(i, pdf):
    savefig2pdf.save(event['value'][0][5000:6000],
                     event['value'][1][5000:6000],
                     event['value'][2][5000:6000],
                     event['start point index'] + 5000,
                     eventlist[i],
                     p_result,
                     s_arrival,
                     pdf)

if __name__ == '__main__':
    pdf = PdfPages('cut_figure.pdf')
    pool = multiprocessing.Pool(processes=10)  # at most 10 worker processes
    for i in range(len(eventlist)):
        pool.apply_async(detect, (i, pdf))
    pool.close()
    pool.join()
    pdf.close()
If I only pass i, it works. How can I also pass pdf to the processes? I need the pdf to stay open for writing until all the processes are done.
Thanks for your help.
Upvotes: 0
Views: 152
Reputation: 4467
The multiprocessing module relies on pickle to serialize the objects you pass to the worker processes. But you cannot pickle the pdf object:
>>> from matplotlib.backends.backend_pdf import PdfPages
>>> import pickle
>>> pdf = PdfPages('cut_figure.pdf')
>>> pickle.dumps(pdf)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-5-e06adaa58666> in <module>()
----> 1 pickle.dumps(pdf)
TypeError: cannot serialize '_io.BufferedWriter' object
So it is not possible to use multiprocessing with a single pdf object. You can try using threading to get multi-threaded execution instead, as your program seems to be I/O-bound (you spend a lot of time writing to a file).
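A minimal sketch of the threading route, assuming a thread pool is acceptable for your workload. It uses multiprocessing.pool.ThreadPool, which offers the same apply_async() interface but runs the workers as threads in one process, so the shared object is never pickled. The io.StringIO buffer stands in for the pdf object, and run_all(), the lock argument, and the body of detect() are hypothetical placeholders for illustration, not your real code:

```python
import io
import threading
from multiprocessing.pool import ThreadPool


def detect(i, out, lock):
    # Hypothetical placeholder for the real per-event work.
    page = f"event {i}\n"
    # Only one thread at a time may write to the shared object.
    with lock:
        out.write(page)


def run_all(n_events, n_threads=10):
    out = io.StringIO()      # stand-in for the single shared pdf object
    lock = threading.Lock()
    pool = ThreadPool(processes=n_threads)
    # Threads share memory, so `out` and `lock` are passed as-is, not pickled.
    results = [pool.apply_async(detect, (i, out, lock))
               for i in range(n_events)]
    pool.close()
    pool.join()
    for r in results:
        r.get()              # re-raises any exception from a worker
    return out.getvalue()    # read only after every worker has finished


if __name__ == "__main__":
    print(run_all(5), end="")
```

Two things to keep in mind if you adapt this. First, apply_async() stores any exception from the worker and only raises it when you call get() on the returned result, which is probably why the original loop appeared to do nothing rather than fail loudly. Second, Matplotlib is not thread-safe, so with the real pdf object the plotting and saving call would likely need to sit inside the lock as well.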
Upvotes: 2