NVaughan

Reputation: 1665

Python multiprocessing: with and without pooling

I'm trying to understand Python's multiprocessing, and have devised the following code to test it:

import multiprocessing

def F(n):
    if n == 0: return 0
    elif n == 1: return 1
    else: return F(n-1)+F(n-2)

def G(n):
    print(f'Fibonacci of {n}: {F(n)}')

processes = []
for i in range(25, 35):
    processes.append(multiprocessing.Process(target=G, args=(i, )))

for pro in processes:
    pro.start()

When I run it, it tells me that the computing time was roughly 6.65 s.

I then wrote the following code, which I thought would be functionally equivalent to the first:

from multiprocessing.dummy import Pool as ThreadPool

def F(n):
    if n == 0: return 0
    elif n == 1: return 1
    else: return F(n-1)+F(n-2)

def G(n):
    print(f'Fibonacci of {n}: {F(n)}')

in_data = [i for i in range(25, 35)]

pool = ThreadPool(10)

results = pool.map(G, in_data)

pool.close()
pool.join()

and its running time was almost 12s.

Why does the second take almost twice as long as the first? Aren't they supposed to be equivalent?

(NB: I'm running Python 3.6, but I also tested similar code on 3.5.2 with the same results.)

Upvotes: 2

Views: 1326

Answers (1)

Gil Hamilton

Reputation: 12347

The reason the second takes twice as long as the first is likely the CPython Global Interpreter Lock (GIL).

From http://python-notes.curiousefficiency.org/en/latest/python3/multicore_python.html:

[...] the GIL effectively restricts bytecode execution to a single core, thus rendering pure Python threads an ineffective tool for distributing CPU bound work across multiple cores.

As you know, multiprocessing.dummy is a wrapper around the threading module, so you're creating threads, not processes. With a CPU-bound task like this one, running under the Global Interpreter Lock is not much different from executing your Fibonacci calculations sequentially in a single thread (except that you've added some thread-management/context-switching overhead).
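You can see this directly by having each pool worker report its process ID and thread name: they all share one PID. A minimal sketch (the whoami helper is just for illustration):

import os
import threading
from multiprocessing.dummy import Pool as ThreadPool

def whoami(n):
    # Every "worker" reports the same PID: these are threads in one process.
    return (n, os.getpid(), threading.current_thread().name)

pool = ThreadPool(4)
for n, pid, name in pool.map(whoami, range(4)):
    print(n, pid, name)
pool.close()
pool.join()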

With the "true multiprocessing" version, you only have a single thread in each process, each of which is using its own GIL. Hence, you can actually make use of multiple processors to improve the speed.

For this particular CPU-bound task, multiple threads hold no advantage over multiple processes. And if you only have a single processor, there is no advantage to using either multiple processes or multiple threads over a single thread/process (in fact, both merely add context-switching overhead to your task).

(FWIW: A join in the true multiprocessing version is apparently done automatically by the Python runtime, so adding an explicit join doesn't seem to make any difference in my tests using time(1). And, by the way, if you did want to add join, you should add a second loop for it; calling join inside the existing start loop would simply serialize your processes. See the sketch below.)
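That is, start everything first, then wait in a separate pass; a sketch against the question's code:

for pro in processes:
    pro.start()          # launch all ten processes first...

for pro in processes:
    pro.join()           # ...then wait for each to finish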

Upvotes: 1
