Hooked

Reputation: 88128

Strange behavior of multiprocessing.Pool, why does it grab the wrong function?

I don't understand the behavior of Python's multiprocessing.Pool in this situation:

import multiprocessing

def f(x): return x
P = multiprocessing.Pool()
def f(x): return x*x

print(P.map(f, range(10)))
print(list(map(f, range(10))))

which results in the output:

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

At the point where the print statements are called, isn't there only a single f? Why does the Pool grab the first definition of f? I would expect P.map and map to output the same results!

Upvotes: 3

Views: 301

Answers (1)

mgilson

Reputation: 309891

This is an excellent question and I hope that someone with more knowledge/experience in Threading (and Multiprocessing) in general can come along and give a better answer, but here's my attempt:

Without really digging into the details here (after a quick look at the source), it appears that the Pool constructor starts its worker processes right away (on Unix, by forking the current process), and those workers then just sit around waiting for tasks to arrive on their queues. Each worker therefore holds a snapshot of __main__ as it was when the Pool was created. A function is sent to a worker by name rather than by value, so when a worker gets the request to run __main__.f, it looks that name up in its own snapshot. Since it has never seen the updated definition of __main__.f, it uses the old one.
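Here's a rough sketch of the "sent by name" part, using pickle directly (as far as I can tell, that is what the Pool uses to ship tasks to its workers). It only shows that the pickled payload is a reference to the name f and not a copy of the function body; the names payload and restored are just mine for illustration, and I'm assuming the snippet is run as a script so that f lives in __main__:

```python
import pickle

def f(x):
    return x

# Pickling a plain function stores only its module and name,
# not its code, so this payload is effectively "look up __main__.f".
payload = pickle.dumps(f)

def f(x):
    return x * x

# Unpickling resolves the name in the *current* process, where
# f has since been rebound to the second definition.
restored = pickle.loads(payload)
print(restored(3))  # 9, because the lookup finds the new f
```

In this single process the lookup finds the new f. A worker that was forked before the second def still has the old f under that name, which would explain the output in the question. If that's right, creating the Pool after the final definition of f (ideally under an if __name__ == '__main__': guard) should make P.map and map agree, at least with the fork start method.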

Upvotes: 3
