Reputation: 33
I have a simple test program written and executed in python 3.6.3 below. It is being executed on a machine with 4 cores.
import multiprocessing
import time
def f(num):
print(multiprocessing.current_process(), num)
time.sleep(1)
if (num % 2):
raise Exception
pool = multiprocessing.Pool(5)
try:
pool.map(f, range(1,20))
except Exception as e:
print("EXCEPTION")
pool.close()
pool.join()
Output with pool = multiprocessing.Pool(5)
:
<ForkProcess(ForkPoolWorker-1, started daemon)> 1
<ForkProcess(ForkPoolWorker-2, started daemon)> 2
<ForkProcess(ForkPoolWorker-3, started daemon)> 3
<ForkProcess(ForkPoolWorker-4, started daemon)> 4
<ForkProcess(ForkPoolWorker-5, started daemon)> 5
<ForkProcess(ForkPoolWorker-2, started daemon)> 6
<ForkProcess(ForkPoolWorker-1, started daemon)> 7
<ForkProcess(ForkPoolWorker-4, started daemon)> 8
<ForkProcess(ForkPoolWorker-3, started daemon)> 9
<ForkProcess(ForkPoolWorker-5, started daemon)> 10
<ForkProcess(ForkPoolWorker-2, started daemon)> 11
<ForkProcess(ForkPoolWorker-1, started daemon)> 12
<ForkProcess(ForkPoolWorker-4, started daemon)> 13
<ForkProcess(ForkPoolWorker-3, started daemon)> 14
<ForkProcess(ForkPoolWorker-5, started daemon)> 15
<ForkProcess(ForkPoolWorker-2, started daemon)> 16
<ForkProcess(ForkPoolWorker-1, started daemon)> 17
<ForkProcess(ForkPoolWorker-3, started daemon)> 18
<ForkProcess(ForkPoolWorker-4, started daemon)> 19
EXCEPTION
But if I change the process count of the pool to be equal to or less than the number of cores on my machine, each call to f()
where num
is even does not print.
output with pool = multiprocessing.Pool(4)
:
<ForkProcess(ForkPoolWorker-1, started daemon)> 1
<ForkProcess(ForkPoolWorker-2, started daemon)> 3
<ForkProcess(ForkPoolWorker-3, started daemon)> 5
<ForkProcess(ForkPoolWorker-2, started daemon)> 7
<ForkProcess(ForkPoolWorker-1, started daemon)> 9
<ForkProcess(ForkPoolWorker-3, started daemon)> 11
<ForkProcess(ForkPoolWorker-3, started daemon)> 13
<ForkProcess(ForkPoolWorker-1, started daemon)> 15
<ForkProcess(ForkPoolWorker-2, started daemon)> 17
<ForkProcess(ForkPoolWorker-1, started daemon)> 19
EXCEPTION
I don't understand why these processes are being killed, especially when the exception isn't even thrown until after the print statement in the function. I really don't understand why it only happens when the process count in the pool is equal to or less than the number of cores on the machine.
Upvotes: 3
Views: 890
Reputation: 1763
referring to the specification of multiprocessing.Pool.map
you can see one optional argument chunksize
, if you specify it to 1, i.e. pool.map(f, range(1,20), 1)
, then you would yield the expected result.
if you increase the chunk size (= 6 for example), you might see:
<SpawnProcess(SpawnPoolWorker-1, started daemon)> 1
<SpawnProcess(SpawnPoolWorker-4, started daemon)> 7
<SpawnProcess(SpawnPoolWorker-3, started daemon)> 13
<SpawnProcess(SpawnPoolWorker-2, started daemon)> 19
this suggests that number of chunksize
of tasks are assigned to a single thread in the Pool, when you raise exception during each thread, of course the tasks in the remaining chuck would not be executed.
From here you can know that the default value for chunksize
is 2 in your case, the reason of existence of such variable, to be seen fairly easily, is to reduce the number of new threads which need to be initialized (which might save both resources and processing time, when you have appropriate chunksize).
Upvotes: 4