Reputation: 794
I wrote the following code
from multiprocessing import Pool
class A:
def __init__(self) -> None:
self.a = 1
def incr(self,b):
self.a += 1
print(self.a)
def execute(self):
with Pool(2) as p:
p.starmap(self.incr, [[1],[1],[1],[1]])
a = A()
a.execute()
print(a.a)
The output is 2 2 2 2 1. I want to understand what exactly happens in this scenario. Does the pool create four copies of self? If so how is this copying done?
Upvotes: 4
Views: 669
Reputation: 1533
In this scenario, the file is run linearly until a.execute()
is called and reaches the call to starmap
. What multiprocessing
does is to create two more processes. The way this is done depends is limited by the OS and can (in some cases be selected using multiprocessing.set_start_method
. The main differences between the methods is that with 'fork'
methods the new process are either created copying the current process (it is more complicated, but that's the idea), while in the 'spawn'
methods a new python interpreter is started, the file is re-executed, and the call is performed. In this later case, as the file is re-executed, it is very important to use the if __name__ == "__main__":
guard.
As each process has its own copy of a
, what happens in one process does not affect the other copies of a
: even if all copies of have a self.a
value of 2
, the original a
is unchanged.
You can test the different methods for starting process with this piece of code (delays added for clarity):
from multiprocessing import Pool, set_start_method, get_start_method
import os
from time import sleep
globalflag = "Flag"
class A:
def __init__(self) -> None:
self.a = 1
def incr(self, b):
sleep(b/10)
print("Process id:", os.getpid(), " incrementing")
print("Flag = ", globalflag)
self.a += 1
print(self.a)
def execute(self):
with Pool() as p:
p.starmap(self.incr, [[1],[2],[3],[4]])
print("Running module:", __name__)
if __name__ == "__main__":
method = 'fork'#'spawn' # 'fork' 'forkserver'
if get_start_method(allow_none=True) is None:
print("Setting start method to ", method)
set_start_method(method)
else:
print('Start method already set: ', get_start_method())
a = A()
globalflag = "Flog"
a.execute()
print(a.a)
output with method='fork'
:
Running module: __main__
Start method already set: fork
Process id: 6687 incrementing
Flag = Flog
2
Process id: 6688 incrementing
Flag = Flog
2
Process id: 6689 incrementing
Flag = Flog
2
Process id: 6690 incrementing
Flag = Flog
2
1
output with 'spawn':
Running module: __main__
Setting start method to spawn
Running module: __mp_main__
Running module: __mp_main__
Running module: __mp_main__
Running module: __mp_main__
Running module: __mp_main__
Process id: 7096 incrementing
Flag = Flag
2
Running module: __mp_main__
Process id: 7097 incrementing
Flag = Flag
2
Running module: __mp_main__
Process id: 7098 incrementing
Flag = Flag
2
Running module: __mp_main__
Process id: 7100 incrementing
Flag = Flag
2
1
Upvotes: 3
Reputation: 3033
try running this changing with Pool(2) as p
with:
with Pool(1) as p
, with Pool(2) as p
, with Pool(4) as p
, with Pool(8) as p
and have a look at the different outputs
from multiprocessing import Pool
import os
print('PID ;',os.getpid()) ## get the current process id or PID
class A:
def __init__(self) -> None:
self.a = 1
def incr(self,b):
self.a += 1
print(self.a)
print('PID ;',os.getpid())
def execute(self):
with Pool(2) as p:
p.starmap(self.incr, [[1],[1],[1],[1]])
a = A()
a.execute()
print(a.a)
print('PID ;',os.getpid())
remember:
os.getpid()
:
Return the current process id.
Upvotes: -2