Reputation: 3267
If you try running this code:
from multiprocessing import Pool
from multiprocessing.pool import ThreadPool
import random

def do_thing(upper):
    print(random.randint(0, 99))

random.seed(0)
with ThreadPool(1) as pool:
    list(pool.imap(do_thing, [99]))
with ThreadPool(1) as pool:
    list(pool.imap(do_thing, [99]))
with Pool(1) as pool:
    list(pool.imap(do_thing, [99]))
with Pool(1) as pool:
    list(pool.imap(do_thing, [99]))
You will find that the ThreadPools print consistent integers across multiple runs, but the Pools don't. I gather from the linked post that we can't guarantee the order in which the processes are created, so in many cases it would be impossible to guarantee consistent results. In my case, though, there are only so many possible orders, yet I see many different outcomes. So I don't think the linked post explains what's happening here.
Note that I want to "propagate" the seed, not reseed with the same number. I don't want the outputs to be all the same.
Also, it looks like this might be possible with a manager, but just wondering if there's an easier "obvious" way that I don't know about.
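To make the "propagate" versus "reseed" distinction concrete, here is a minimal single-process sketch. The helper names reseeded and propagated are made up for illustration, and no pool is involved:

```python
import random


def reseeded(n, seed=0):
    """Reseed before every draw: every value comes out the same."""
    return [random.Random(seed).randint(0, 99) for _ in range(n)]


def propagated(n, seed=0):
    """Seed once and keep drawing from the same stream."""
    rng = random.Random(seed)
    return [rng.randint(0, 99) for _ in range(n)]


if __name__ == '__main__':
    print(reseeded(4))    # four copies of one number
    print(propagated(4))  # differing numbers, identical on every run
```

The second behaviour is what I would like to get from the Pool workers.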
Upvotes: 0
Views: 335
Reputation: 44303
One way (I think the only practical way) of solving this problem is to come up with a managed random number generator class that you can either pass to your worker function as an argument (the option chosen here) or use to initialize each process in the pool as a global variable. I have modified your code slightly so that, instead of printing the random number, do_thing returns the value. I have also modified the main process to create a pool of size 8 and to invoke do_thing 8 times. Finally, I have added a call to sleep in do_thing to ensure that each of the 8 processes handles one submitted task (I have 8 cores) rather than the first process handling all 8 tasks, which is a possibility when the submitted job completes very quickly:
from multiprocessing import Pool, current_process
from multiprocessing.managers import BaseManager
import random
import time
from functools import partial

class RandomGeneratorManager(BaseManager):
    pass

class RandomGenerator:
    def __init__(self):
        random.seed(0)

    def get_random(self):
        return random.randint(0, 99)

def do_thing(random_generator, upper):
    time.sleep(.2)
    print(current_process())
    return random_generator.get_random()

# Required for Windows:
if __name__ == '__main__':
    RandomGeneratorManager.register('RandomGenerator', RandomGenerator)
    with RandomGeneratorManager() as manager:
        random_generator = manager.RandomGenerator()
        # random_generator will be the first argument to do_thing:
        worker = partial(do_thing, random_generator)
        with Pool(8) as pool:
            print(pool.map(worker, [0] * 8))
        with Pool(8) as pool:
            print(pool.map(worker, [0] * 8))
Prints:
<SpawnProcess name='SpawnPoolWorker-3' parent=23956 started daemon>
<SpawnProcess name='SpawnPoolWorker-2' parent=23956 started daemon>
<SpawnProcess name='SpawnPoolWorker-4' parent=23956 started daemon>
<SpawnProcess name='SpawnPoolWorker-5' parent=23956 started daemon>
<SpawnProcess name='SpawnPoolWorker-7' parent=23956 started daemon>
<SpawnProcess name='SpawnPoolWorker-6' parent=23956 started daemon>
<SpawnProcess name='SpawnPoolWorker-9' parent=23956 started daemon>
<SpawnProcess name='SpawnPoolWorker-8' parent=23956 started daemon>
[49, 97, 53, 5, 33, 65, 51, 62]
<SpawnProcess name='SpawnPoolWorker-14' parent=23956 started daemon>
<SpawnProcess name='SpawnPoolWorker-10' parent=23956 started daemon>
<SpawnProcess name='SpawnPoolWorker-13' parent=23956 started daemon>
<SpawnProcess name='SpawnPoolWorker-11' parent=23956 started daemon>
<SpawnProcess name='SpawnPoolWorker-17' parent=23956 started daemon>
<SpawnProcess name='SpawnPoolWorker-15' parent=23956 started daemon>
<SpawnProcess name='SpawnPoolWorker-16' parent=23956 started daemon>
<SpawnProcess name='SpawnPoolWorker-12' parent=23956 started daemon>
[38, 61, 45, 74, 27, 64, 17, 36]
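For the other option mentioned above, installing the managed generator in each pool process as a global variable, a rough sketch could look like the following. The names init_worker and run_pool, the seed and upper parameters, and the private random.Random instance are assumptions of this sketch rather than part of the code above:

```python
import random
from multiprocessing import Pool
from multiprocessing.managers import BaseManager


class RandomGenerator:
    """Lives in the manager process; workers reach it through a proxy."""

    def __init__(self, seed=0):
        # Assumption of this sketch: a private Random instance instead of
        # seeding the module-level generator.
        self._rng = random.Random(seed)

    def get_random(self, upper=99):
        return self._rng.randint(0, upper)


class RandomGeneratorManager(BaseManager):
    pass


RandomGeneratorManager.register('RandomGenerator', RandomGenerator)

# Set once in every worker process by init_worker.
random_generator = None


def init_worker(generator):
    """Pool initializer: store the shared proxy as a per-process global."""
    global random_generator
    random_generator = generator


def do_thing(upper):
    # No generator argument needed; the global was installed at startup.
    return random_generator.get_random(upper)


def run_pool(pool_size, n_tasks):
    with RandomGeneratorManager() as manager:
        shared = manager.RandomGenerator(0)
        with Pool(pool_size, initializer=init_worker,
                  initargs=(shared,)) as pool:
            return pool.map(do_thing, [99] * n_tasks)


# Required for Windows:
if __name__ == '__main__':
    print(run_pool(4, 8))
```

With either variant the seed fixes the sequence of numbers that are drawn, but which task receives which number still depends on the order in which the workers reach the manager.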
Upvotes: 1