Reputation: 127
I have a very simple application with a nested for-loop, and it can take minutes to hours to run depending on the amount of data.
I got started with the multiprocessing library in Python. I tried implementing it in its most basic form, and even though my code runs, there are no performance gains, leading me to believe I am implementing it incorrectly and/or the design of my code is deeply flawed.
My code is pretty straightforward:
import csv
import multiprocessing

somedata1 = open('data1.csv', 'r')
SD_data = csv.reader(somedata1, delimiter=',')
data1 = []
# ... import lots of CSV data into data1 ... data5 ...

def crunchnumbers():
    for i1, vald1 in enumerate(data1):
        for i2, vald2 in enumerate(data2):
            for i3, vald3 in enumerate(data3):
                for i4, vald4 in enumerate(data4):
                    for i5, vald5 in enumerate(data5):
                        sol = ...  # add values
                        print sol

if __name__ == '__main__':
    pool = multiprocessing.Pool(processes=4)
    pool.apply(crunchnumbers)
How can I do this with Python's multiprocessing? (Somehow splitting the work into chunks?) Or is this a better job for Jug? Based on suggestions on SO, I spent a few days trying to use Jug, but the number of iterations in my nested for-loops easily gets into the tens of millions (and more) of very fast tasks, so the author recommends against it.
Upvotes: 1
Views: 1100
Reputation: 42758
I suggest using itertools.product together with multiprocessing's map. Note that pool.apply(crunchnumbers) blocks and runs the whole function once in a single worker process, which is why you see no speedup; mapping over the individual combinations lets the pool actually distribute the work:
import multiprocessing
from itertools import product

def crunchnumber(values):
    # values is one tuple holding a row from each dataset
    if some_criteria(values):  # your filter here
        sol = values[0][2] + values[1][2] + values[2][2]  # ... and so on
        return sol
    return None  # combinations failing the criteria are filtered out below

def process(datas):
    "takes data1, ..., datan as a list"
    pool = multiprocessing.Pool(processes=4)
    # map_async returns an AsyncResult; .get() waits for and collects the results
    result = pool.map_async(crunchnumber, product(*datas))
    print [a for a in result.get() if a is not None]
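With tens of millions of tiny tasks, map_async builds the entire result list in memory and the per-task overhead can dominate. Here is a minimal runnable sketch of the same pattern using imap_unordered with a large chunksize; the toy data and the positive-sum criterion are assumptions purely for illustration, only the product/pool pattern is the point:

import multiprocessing
from itertools import product

def crunchnumber(values):
    # hypothetical criterion: keep combinations whose third-column sum is positive
    sol = sum(row[2] for row in values)
    if sol > 0:
        return sol
    return None

if __name__ == '__main__':
    # toy stand-ins for the CSV rows; the third field mirrors values[i][2] above
    data1 = [(0, 0, 1.0), (0, 0, 2.0)]
    data2 = [(0, 0, 3.0)]
    data3 = [(0, 0, -4.0), (0, 0, 5.0)]
    pool = multiprocessing.Pool(processes=4)
    # imap_unordered streams results lazily instead of materializing a list;
    # a big chunksize amortizes inter-process overhead across many tiny tasks
    results = pool.imap_unordered(crunchnumber,
                                  product(data1, data2, data3),
                                  chunksize=10000)
    print([a for a in results if a is not None])
    pool.close()
    pool.join()

The chunksize of 10000 is just a starting point; the faster each individual task is, the larger the chunks should be.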
Upvotes: 3