Hosea

Reputation: 205

Python Multiprocessing pool with multiple arguments and void function

I'm trying to use Python's multiprocessing library to call a function that takes multiple arguments and does not return anything. Here is my minimal working example.

import numpy as np
from multiprocessing import Pool

dim1 = 2
dim2 = 2

test1 = np.zeros((dim1,dim2))
test2 = np.zeros((dim1,dim2))

iteration = []
for i in range(0,dim1):
    for j in range(0,dim2):
        iteration.append((i,j))
        
def testing(num1,num2):
    test1[num1,num2] = 1
    test2[num1,num2] = 2
    
if __name__ == '__main__':
    pool = Pool(processes=4)  
    pool.starmap(testing, iteration)
    
print(test1)
print(test2)

The problem is that test1 and test2 still print as the zero arrays they were initialized to. What I want instead is for test1 to be an array of 1s and test2 to be an array of 2s. What I would like the code

if __name__ == '__main__':
    pool = Pool(processes=4)  
    pool.starmap(testing, iteration)

to do is this:

testing(0,0)
testing(1,0)
testing(0,1)
testing(1,1)

I've seen some related posts like this. The difference between that post and mine is that my function is a void function: rather than returning values, I'd like it to modify the arrays in place.

Upvotes: 4

Views: 855

Answers (1)

Mike67

Reputation: 11342

To update an array across multiple processes using a global array without returning results:

  • Use the multiprocessing.Array class to store the array data.
  • Use the initializer parameter when creating the pool to pass the arrays to the processes.

Note that a multiprocessing.Array is one-dimensional, so it must be wrapped in a NumPy array and reshaped to 2D before it is updated or displayed.
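To illustrate the reshaping step on its own, here is a minimal single-process sketch. The names shared and view and the 2x3 shape are only for illustration. It assumes the default lock-wrapped Array, whose get_obj() exposes the underlying buffer, so the NumPy array is a view onto the same memory rather than a copy.

```python
import numpy as np
from multiprocessing import Array

# Flat shared buffer of 6 doubles, zero-initialised.
shared = Array('d', 2 * 3)

# Wrap the buffer without copying, then view it as a 2x3 matrix.
view = np.frombuffer(shared.get_obj()).reshape((2, 3))

# Writing through the 2D view changes the underlying flat buffer.
view[1, 2] = 7.0

print(shared[:])  # element 5 (row 1, column 2) now holds 7.0
```

Because the view shares memory with the Array, nothing has to be copied back after the update.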

Try this code:

import numpy as np
from multiprocessing import Pool, Array

dim1 = 2
dim2 = 2

def init(tt1,tt2):  # receive shared arrays
    global test1,test2
    test1,test2 = tt1,tt2

def testing(num1,num2):
    t1 = np.frombuffer(test1.get_obj()).reshape((dim1, dim2))  # need to reshape to 2D array
    t2 = np.frombuffer(test2.get_obj()).reshape((dim1, dim2))
    t1[num1,num2] = 1
    t2[num1,num2] = 2
   
if __name__ == '__main__':
    tt1 = Array('d', dim1*dim2)  # 1 dimensional arrays
    tt2 = Array('d', dim1*dim2)

    iteration = []
    for i in range(0,dim1):
        for j in range(0,dim2):
            iteration.append((i,j))
            
    # pass shared arrays to each worker process; the with block shuts the pool down afterwards
    with Pool(processes=4, initializer=init, initargs=(tt1,tt2)) as pool:
        pool.starmap(testing, iteration)
    
    # still have access to the shared arrays
    t1final = np.frombuffer(tt1.get_obj()).reshape((dim1, dim2))
    t2final = np.frombuffer(tt2.get_obj()).reshape((dim1, dim2))
    print(t1final, t2final, sep='\n')

Output

[[1. 1.]
 [1. 1.]]
[[2. 2.]
 [2. 2.]]
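For context on why the original version printed zeros: each worker process works on its own copy of an ordinary NumPy array, so its writes never reach the parent. The sketch below contrasts that with a shared Array, using a single Process instead of a Pool to keep it short. The names plain, worker, and run_demo are hypothetical and not part of the code above.

```python
import numpy as np
from multiprocessing import Process, Array

plain = np.zeros(4)  # ordinary array: every process has its own copy

def worker(shared_arr):
    plain[0] = 1.0                                # changes the child's copy only
    np.frombuffer(shared_arr.get_obj())[0] = 1.0  # writes into shared memory

def run_demo():
    shared = Array('d', 4)  # shared-memory array, zero-initialised
    p = Process(target=worker, args=(shared,))
    p.start()
    p.join()
    # Read both values back in the parent process.
    return float(plain[0]), shared[0]

if __name__ == '__main__':
    print(run_demo())  # (0.0, 1.0): only the shared write is visible
```

The pool initializer in the answer serves the same purpose as passing shared_arr here: it hands the shared arrays to each worker when the worker starts.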

Upvotes: 1
