Reputation: 1465
I have a sequential function that sorts through lists and performs tasks. For example... (this is not the actual code but is analogous)
def myFunction(list):
    for item in list:
        sublist_a = item[0]
        sublist_b = item[1]
        sublist_c = item[2]
        sublist_d = item[3]
        for row in sublist_a:
            pass # (do tasks....)
        for row in sublist_b:
            pass # (do tasks....)
        for row in sublist_c:
            pass # (do tasks....)
        for row in sublist_d:
            pass # (do tasks....)
    print "COMPLETE"
So this is overly simplified, but essentially these lists are quite large, and the order of execution is important (i.e. the for row in ... loops must run in sequence), so I would like to split the work between the available cores on my system.
Could someone please suggest a method for doing so?
I have never used the multiprocessing library, but it seems it is probably the best one to use with Python.
Upvotes: 1
Views: 5784
Reputation: 26022
You are looking for a multiprocessing.Pool.
from multiprocessing import Pool

def function_to_process_a(row):
    return row * 42  # or something similar

# define function_to_process_b, _c and _d analogously

# replace 4 by the number of cores that you want to utilize
with Pool(processes=4) as pool:
    # The lists are processed one after another,
    # but the items within each list are processed in parallel.
    processed_sublist_a = pool.map(function_to_process_a, sublist_a)
    processed_sublist_b = pool.map(function_to_process_b, sublist_b)
    processed_sublist_c = pool.map(function_to_process_c, sublist_c)
    processed_sublist_d = pool.map(function_to_process_d, sublist_d)
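To connect this back to your function: the same pattern can be folded into myFunction itself, reusing one pool for all four sublists. Here is a minimal runnable sketch, assuming the per-row work can be expressed as a top-level function (the process_row name and its * 42 body are placeholders, not your actual tasks):

from multiprocessing import Pool, cpu_count

def process_row(row):
    # hypothetical stand-in for the real per-row task;
    # must live at module level so it can be pickled for the workers
    return row * 42

def myFunction(items):
    # one pool reused for every sublist avoids repeated worker startup
    with Pool(processes=cpu_count()) as pool:
        for item in items:
            sublist_a, sublist_b, sublist_c, sublist_d = item  # assumes 4 sublists per item
            # each pool.map blocks until its sublist is finished, so the
            # sublists are still handled strictly one after another
            results_a = pool.map(process_row, sublist_a)
            results_b = pool.map(process_row, sublist_b)
            results_c = pool.map(process_row, sublist_c)
            results_d = pool.map(process_row, sublist_d)
    print("COMPLETE")

if __name__ == "__main__":
    demo = [(list(range(5)), list(range(5)), list(range(5)), list(range(5)))]
    myFunction(demo)

Note that pool.map returns its results in input order, so the ordering requirement within each sublist is preserved; the if __name__ == "__main__": guard is required on platforms that spawn new processes (e.g. Windows).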
Edit: As sidewaiise pointed out in the comments, it is preferable to use this pattern:
from contextlib import closing
from multiprocessing import Pool, cpu_count

with closing(Pool(processes=cpu_count())) as pool:
    pass  # do something
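The point of closing() is that on Python 2 a Pool cannot be used directly in a with statement; contextlib.closing supplies the context manager and guarantees pool.close() is called. A minimal usage sketch (square is just a stand-in task, not part of the original question):

from contextlib import closing
from multiprocessing import Pool, cpu_count

def square(n):
    return n * n  # stand-in for the real work

if __name__ == "__main__":
    with closing(Pool(processes=cpu_count())) as pool:
        results = pool.map(square, range(10))
    print(results)

One difference worth knowing: closing() calls pool.close(), which lets outstanding work finish, whereas Python 3's with Pool(...) calls terminate() on exit; since pool.map has already blocked until all results are in, both are safe in this pattern.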
Upvotes: 4