Reputation: 451
I am trying to parallelize operations on a big array. I have summarized my approach in the code snippet below. Since the operations on the big array are costly, I want to run only 4 of the 100 processes (i.e. n_cpus) in parallel at each iteration. Once an iteration finishes, some garbage collection should be done and the next iteration should start. However, the main loop performs only the first iteration and then terminates. I would be glad if a parallel-processing expert could point out how to correct my code to achieve the desired task.
from multiprocessing import Process

def train_model(model, big_array, i):
    model = do_operations_on(big_array)

# edit: this part is within a class
n_processes = 100
n_cpus = 4
models = [None for _ in range(n_processes)]
n_iterations = n_processes / n_cpus
for it in range(n_iterations):
    procs = [Process(target=train_model,
                     args=(models[it*n_cpus + i], big_array, i))
             for i in range(n_cpus)]
    for p in procs: p.start()
    for p in procs: p.join()
Upvotes: 0
Views: 810
Reputation: 76297
Your idea seems basically OK, except for a few problems:

- As RaJa pointed out, you should probably pass results back via queues instead of relying on shared state.
- Your use of multiprocessing.Process is unnecessarily low level here; you should use multiprocessing.Pool, which is also more efficient, since you can reuse the processes instead of repeatedly setting them up and tearing them down.
- Your code has a mixup: train_model ignores its model and i arguments, and merely rebinds the local name model, so nothing is ever written back to the models list (see the queue-based sketch after this list).
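To illustrate the first and third points, here is a minimal sketch of returning results through a multiprocessing.Queue, reusing do_operations_on, big_array, and n_cpus from your question (the worker body is a stand-in; rebinding a parameter in the child never reaches the parent, so the result must be sent back explicitly):

from multiprocessing import Process, Queue

def worker(queue, big_array, i):
    model = do_operations_on(big_array)  # runs in the child process
    queue.put((i, model))                # explicitly send the result back

queue = Queue()
procs = [Process(target=worker, args=(queue, big_array, i))
         for i in range(n_cpus)]
for p in procs: p.start()
# Drain the queue before joining, so large results cannot block the children.
results = dict(queue.get() for _ in procs)
for p in procs: p.join()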
So, in the following code, I assume you have something like
def train_model(specs, big_array):
    return ...
that takes one element of specifics plus the data, and returns a model built for those specifics.
I also assume you have some array specifics containing all the specifics you want to try (and that its length is divisible by the number of CPUs, a restriction that is not hard to remove).
Finally, I assume the point is to build a list models
of all the models.
Your code becomes:
from functools import partial
from multiprocessing import Pool

n_cpus = 4
n_iterations = len(specifics) // n_cpus  # integer division
models = []
p = Pool(n_cpus)
# A lambda cannot be pickled and sent to worker processes,
# so bind big_array with functools.partial instead.
worker = partial(train_model, big_array=big_array)
for it in range(n_iterations):
    cur_specs = specifics[it * n_cpus: (it + 1) * n_cpus]
    cur_models = p.map(worker, cur_specs)
    models.extend(cur_models)
    # Cleanup here
p.close()
p.join()
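If the per-iteration cleanup turns out to be unnecessary, a shorter variant (a sketch under the same assumptions about train_model, specifics, and big_array) lets the pool do all the batching itself:

from functools import partial
from multiprocessing import Pool

with Pool(n_cpus) as p:
    # map blocks until every spec has been processed; the pool keeps
    # all n_cpus workers busy without manual slicing into iterations.
    models = p.map(partial(train_model, big_array=big_array), specifics)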
Upvotes: 1