Reputation: 311
The code I use without multiprocessing is below; it spends 0:00:03.044280:
from datetime import datetime

def execute_it():
    number = 10000000
    listing_1 = range(number)
    listing_2 = range(number)
    listing_3 = range(number)
    start = datetime.now()
    task(listing_1, listing_2, listing_3)
    print datetime.now() - start

def task(listing_1, listing_2, listing_3):
    for l1, l2, l3 in zip(listing_1, listing_2, listing_3):
        l1 + l2 + l3
I want to use multiprocessing to make it faster; the code I tried is below:
from datetime import datetime
import multiprocessing as mp

def execute_it():
    number = 10000000
    listing_1 = list(range(number))
    listing_2 = list(range(number))
    listing_3 = list(range(number))
    params = zip(listing_1, listing_2, listing_3)
    start = datetime.now()
    pool = mp.Pool(processes=5)
    pool.map(task, params)
    pool.close()
    print datetime.now() - start

def task(params):
    params[0] + params[1] + params[2]
It spends 0:00:15.654919!
What is wrong with my code? I am sure they both do the same thing.
Upvotes: 2
Views: 51
Reputation: 28703
The multiprocessing version takes longer because it is effectively the same as the single-process version plus the extra overhead of creating processes and running map.
You can replace zip with itertools.izip and pool.map with pool.imap to get the expected parallel behaviour; otherwise the main process does the heavy lifting up front, because zip materializes the full list of argument tuples before the pool ever sees them.
from datetime import datetime
from itertools import izip
import multiprocessing as mp
...
def execute_it():
    number = 10000000
    listing_1 = list(range(number))
    listing_2 = list(range(number))
    listing_3 = list(range(number))
    params = izip(listing_1, listing_2, listing_3)
    start = datetime.now()
    pool = mp.Pool(processes=5)
    # imap returns a lazy iterator; consume it so the workers actually run
    for _ in pool.imap(task, params):
        pass
    pool.close()
    print datetime.now() - start
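For anyone on Python 3: itertools.izip no longer exists because the built-in zip is already lazy, and print is a function. A minimal sketch of the same approach in Python 3, scaled down from the question's 10 000 000 items so it runs quickly (the chunksize argument is my addition; it batches work per task to cut inter-process overhead):

```python
from multiprocessing import Pool

def task(params):
    # each worker receives one (l1, l2, l3) tuple and returns its sum
    return params[0] + params[1] + params[2]

def run(number, processes=5):
    # in Python 3 the built-in zip is already lazy, so no izip is needed
    params = zip(range(number), range(number), range(number))
    with Pool(processes=processes) as pool:
        # imap yields results lazily; consuming them drives the workers
        return sum(pool.imap(task, params, chunksize=1000))

if __name__ == '__main__':
    # 3 * (0 + 1 + ... + 9999)
    print(run(10000))
```

Note that for a task this trivial, process-creation and pickling overhead can still dominate, so the parallel version is not guaranteed to beat a plain loop.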
Upvotes: 1