Reputation: 8127
I'm benchmarking this script on a 6-core CPU with Ubuntu 22.04.1 and Python 3.10.6. It is supposed to show usage of all available CPU cores with the par function vs. a single core with the ser function.
import numpy as np
from multiprocessing import Pool
import timeit as ti

def foo(n):
  return -np.sort(-np.arange(n))[-1]

def par(reps, bigNum, pool):
  for i in range(bigNum, bigNum+reps):
    pool.apply_async(foo, args=(i,))

def ser(reps, bigNum):
  for i in range(bigNum, bigNum+reps):
    foo(i)

if __name__ == '__main__':
  bigNum = 9_000_000
  reps = 6

  fun = f'par(reps, bigNum, pool)'
  t = 1000 * np.array(ti.repeat(stmt=fun, setup='pool=Pool(reps);'+fun, globals=globals(), number=1, repeat=10))
  print(f'{fun}: {np.amin(t):6.3f}ms {np.median(t):6.3f}ms')

  fun = f'ser(reps, bigNum)'
  t = 1000 * np.array(ti.repeat(stmt=fun, setup=fun, globals=globals(), number=1, repeat=10))
  print(f'{fun}: {np.amin(t):6.3f}ms {np.median(t):6.3f}ms')
Right now, the par function only measures the time needed to dispatch the jobs to the worker processes. What do I need to change in the par function so that it waits for all worker processes to complete before returning? Note that I would like to reuse the process pool between calls.
Upvotes: 1
Views: 85
Reputation: 17516
You need to keep the AsyncResult objects returned by apply_async and call get() on them to wait for the jobs to finish.
def par(reps, bigNum, pool):
    jobs = []
    for i in range(bigNum, bigNum+reps):
        jobs.append(pool.apply_async(foo, args=(i,)))
    for job in jobs:
        job.get()
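If foo returned a value you cared about, the same pattern collects the results as well; get() blocks until each job finishes and re-raises any exception from the worker. A minimal sketch of that variant (not part of the original answer):

def par(reps, bigNum, pool):
    # submit all jobs first so they run concurrently across the pool's workers
    jobs = [pool.apply_async(foo, args=(i,)) for i in range(bigNum, bigNum + reps)]
    # get() waits for each job and re-raises any worker-side exception,
    # so par only returns once all submitted work has completed
    return [job.get() for job in jobs]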
For long loops you should use map, imap, or imap_unordered instead of apply_async: they have less overhead, you get to control the chunksize for faster serialization of small objects, and you can pass generators to them to save memory or even allow infinite generators (with imap).
def par(reps, bigNum, pool):
    pool.map(foo, range(bigNum, bigNum+reps), chunksize=1)
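For the generator case mentioned above, a minimal sketch (not from the original answer; par_unordered is a name I made up) using imap_unordered, which consumes its input lazily and yields results as workers finish them:

def par_unordered(reps, bigNum, pool):
    # lazy input: nothing is materialized up front, so this also works
    # for very long streams of work items
    sizes = (bigNum + i for i in range(reps))
    for result in pool.imap_unordered(foo, sizes, chunksize=1):
        pass  # iterating drains the pool; the loop ends only when all jobs are done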
Note: PEP 8 indentation in Python is 4 spaces, not 2.
Upvotes: 1