sidewaiise

Reputation: 1465

How to split python work between cores? (Multiprocessing lib)

I have a sequential function that sorts through lists and performs tasks. For example... (this is not the actual code, but it is analogous)

def myFunction(items):
    for item in items:
        sublist_a = item[0]
        sublist_b = item[1]
        sublist_c = item[2]
        sublist_d = item[3]
        for row in sublist_a:
            pass # (do tasks....)
        for row in sublist_b:
            pass # (do tasks....)
        for row in sublist_c:
            pass # (do tasks....)
        for row in sublist_d:
            pass # (do tasks....)
    print("COMPLETE")

So this is overly simplified, but essentially these lists are quite large, and the order of execution within each sublist matters (i.e. the for row in ... loops), so I would like to split the work between the available cores on my system.

Could someone please suggest a method for doing so?

I have never used the multiprocessing library, but it seems like it is probably the best one to use with Python.

Upvotes: 1

Views: 5784

Answers (1)

Kijewski

Reputation: 26022

You are looking for multiprocessing.Pool:

from multiprocessing import Pool

def function_to_process_a(row):
    return row * 42  # or something similar

# ... define function_to_process_b/c/d the same way ...

# Replace 4 with the number of cores that you want to utilize.
with Pool(processes=4) as pool:
    # The lists are processed one after another,
    # but the items within each list are processed in parallel.
    # map() returns the results in the same order as the input,
    # so the ordering inside each sublist is preserved.
    processed_sublist_a = pool.map(function_to_process_a, sublist_a)
    processed_sublist_b = pool.map(function_to_process_b, sublist_b)
    processed_sublist_c = pool.map(function_to_process_c, sublist_c)
    processed_sublist_d = pool.map(function_to_process_d, sublist_d)
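
For completeness, here is a minimal, runnable sketch of how the pool could be wired into the structure from your question. This assumes Python 3; process_a/process_b and the example data are placeholders, not your actual tasks. Note that pool.map blocks until the whole list is processed, and the if __name__ == "__main__": guard is required on platforms that spawn worker processes (e.g. Windows):

from multiprocessing import Pool

def process_a(row):
    return row * 42  # placeholder "task"

def process_b(row):
    return row + 1   # placeholder "task"

def myFunction(items):
    with Pool(processes=4) as pool:
        for item in items:
            sublist_a = item[0]
            sublist_b = item[1]
            # map() blocks until the whole list is done and returns
            # the results in the same order as the input list.
            result_a = pool.map(process_a, sublist_a)
            result_b = pool.map(process_b, sublist_b)
            # (do something with result_a / result_b ...)
    print("COMPLETE")

if __name__ == "__main__":
    # The guard keeps spawned worker processes from re-running
    # this call when they import the module.
    myFunction([([1, 2, 3], [4, 5, 6])])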

Edit: As sidewaiise pointed out in the comments, it is preferable to use this pattern:

from contextlib import closing
from multiprocessing import Pool, cpu_count

with closing(Pool(processes=cpu_count())) as pool:
    pass # do something
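
(On Python 2, Pool cannot be used in a with statement directly, which is why closing() is needed there. On Python 3.3+ the pool itself is a context manager, but its __exit__ calls terminate() rather than close(), so closing() still gives a more graceful shutdown.)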

Upvotes: 4
