Reputation: 3
I use the multiprocessing library in Python to distribute a function over multiple cores. To do that I use the Pool class, but I want to know when each process has completed its work.
Here is the code:
from random import random
from functools import partial
from multiprocessing import Pool

def parallel(m, G):
    D = 0
    for i in xrange(G):
        D += random()
    return 1 * (D < 1)

pool = Pool()
TOTAL = 0
for i in xrange(10):
    TOTAL += sum(pool.map(partial(parallel, G=2), xrange(100)))
print TOTAL
I know how to use time.time() in a normal situation, but what I need is to know when each core has completed its part of the job. If I put a timestamp directly in the function, I will get many time values without knowing which core produced them.
Any advice is welcome!
Upvotes: 0
Views: 1030
Reputation: 2035
You may return the completion time along with the actual result from parallel, and then pick the last timestamp for each worker.
import time
from random import random
from functools import partial
from multiprocessing import Pool, current_process
def parallel(m, G):
    D = 0
    for i in xrange(G):
        D += random()
    # uncomment to give the other workers more chances to run
    # time.sleep(.001)
    return (current_process().name, time.time()), 1 * (D < 1)

# don't deny the existence of Windows
if __name__ == '__main__':
    pool = Pool()
    TOTAL = 0
    proc_times = {}
    for i in xrange(5):
        # times is a list of proc_name:timestamp pairs
        times, results = zip(*pool.map(partial(parallel, G=2), xrange(100)))
        TOTAL += sum(results)
        # proc_times_loc is guaranteed to hold the last timestamp
        # for each proc_name, see the doc on dict
        proc_times_loc = dict(times)
        print 'local completion times:', proc_times_loc
        proc_times.update(proc_times_loc)
    print TOTAL
    print 'total completion times:', proc_times
However, when the jobs are this simple, you may find that calling time.time on every call consumes too much CPU time.
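If that overhead matters, one option is to hand each task a batch of inputs, so the worker calls time.time once per batch instead of once per item. A minimal sketch of that idea (parallel_batch and the batch size of 10 are my own hypothetical choices, not part of the approach above):

import time
from random import random
from functools import partial
from multiprocessing import Pool, current_process

def parallel_batch(ms, G):
    # process a whole batch of inputs per task, so the worker
    # calls time.time only once per batch
    total = 0
    for m in ms:
        D = 0
        for i in xrange(G):
            D += random()
        total += 1 * (D < 1)
    return (current_process().name, time.time()), total

if __name__ == '__main__':
    pool = Pool()
    # split the 100 inputs into batches of 10 (the size is arbitrary)
    batches = [range(i, i + 10) for i in xrange(0, 100, 10)]
    times, results = zip(*pool.map(partial(parallel_batch, G=2), batches))
    print 'TOTAL:', sum(results)
    print 'completion times:', dict(times)

Here dict(times) relies on the same dict behavior as above: later pairs overwrite earlier ones, so each worker name keeps its last timestamp.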
Upvotes: 1