RogueDodecahedron

Reputation: 105

Python multiprocessing: avoid communication via module between processes

I want to perform the following task: In the main program 'main.py', I define some input parameters, do a calculation based on these parameters using a function f() and store the result. The function f() and some of the parameters are defined in a central module 'test.py'.

I have to do this for a large set of parameters and therefore want to give each CPU a set of parameters, perform the calculation and return the result which is then stored in an array 'data'.

The problem: each process needs to access and define values in the module 'test.py' and I want to avoid any communication/interference between processes.

I attached a minimal working example consisting of the main file main.py and the module test.py.

If one performs the calculation one sees that the results in 'data' are correct, but the print statement returns pairs (a, b) which do not correspond to the default values.

First, I want to understand what happens here. It seems that each process prints the (a, b) defined by previous processes, then defines the new values and yields the correct result.

Second, for the moment the program works (even for larger datasets and much more complicated calculations), but I don't want to risk wrong results by interference between processes. Is there a way to avoid any communication between the processes? Maybe each process gets a copy of the module and does the calculation using this copy?

Upvotes: 0

Views: 128

Answers (1)

PythonFan

Reputation: 78

I think your issue is that you print t.a and t.b BEFORE you assign the new values in your calc(x) function, so the print shows the old values. A multiprocessing pool reuses each worker process for several tasks, and every worker keeps its own copy of the module test.py, so what you see is whatever the previous call in that same worker left behind. You would only see the default values (1, 1) on a worker's first task. The values are not passed from one process to another.

If you really want to make sure all your processes are 100% independent of the main process, you could pass all your arguments in one structure, i.e. define in your test.py:

def f(struct):
    return (struct.a+struct.b)**struct.s

and in your main.py make a list of these structures. So I guess you define a structure and populate it

class myStruct:
    def __init__(self, s, a=1, b=1):  # default values of a and b are 1
        self.a = a
        self.b = b
        self.s = s

You could then populate a list of these structures and pass the list to your multiprocessing.Pool, for example with Pool.map.
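Putting the pieces together, a rough sketch could look like the following. It assumes everything lives in one file for brevity (you would keep f in test.py), and the parameter values s = 1, 2, 3 are made up for illustration.

```python
import multiprocessing as mp


class myStruct:
    """All inputs for one calculation, so no module-level state is needed."""

    def __init__(self, s, a=1, b=1):
        self.a = a
        self.b = b
        self.s = s


def f(struct):
    # Depends only on its argument, not on values stored in a module.
    return (struct.a + struct.b) ** struct.s


if __name__ == "__main__":
    # Hypothetical parameter set: s = 1, 2, 3 with the default a and b.
    params = [myStruct(s) for s in (1, 2, 3)]
    with mp.Pool() as pool:
        # Each struct is pickled and sent to a worker; results keep input order.
        data = pool.map(f, params)
    print(data)
```

Because f only reads from the structure it receives, it does not matter which worker handles which task or what that worker computed before.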

Not sure if that was super helpful, but I don't have enough reputation to make a comment.

Cheers mate and good luck.

Upvotes: 1
