spurra

Reputation: 1025

Python - Are calls to functions of numpy.random thread safe?

According to this answer, they aren't. But that has not been consistent with what I've observed so far. Consider the following script:

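import numpy as np
from multiprocessing.dummy import Pool  # thread-based Pool, not processes
from queue import Queue

SIZE = 1000000

# Multi-threaded run: 100 threads share the global np.random state.
np.random.seed(1)
tPool = Pool(100)
q1 = Queue()

def worker_thread(i):
    # Draw 5 integers from range(100) and store them in arrival order.
    q1.put(np.random.choice(100, 5))

tPool.map(worker_thread, range(SIZE))

# Single-threaded reference run with the same seed.
q2 = Queue()
np.random.seed(1)
for i in range(SIZE):
    q2.put(np.random.choice(100, 5))

# The comparison is element-wise, so n ends up as an array of 5 counts.
n = 0
for i in range(SIZE):
    n += (q1.get() == (q2.get()))

print(n)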

Basically, what I'm testing here is whether SIZE calls generate the same sequence in a multi-threaded environment as in a single-threaded one. For me this prints n with every element equal to SIZE. Of course this could be just chance, so I ran it a few times and have been getting consistent results. So my question is: are calls to functions of the numpy.random package thread-safe?
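Because the comparison in the script is element-wise, n is an array of five counts rather than a single number. A minimal sketch of a scalar comparison follows, assuming the draws are first collected into two lists; the helper name count_matching_draws and the reduced sample of 1000 draws are illustrative and not part of the original script.

```python
import numpy as np

def count_matching_draws(draws_a, draws_b):
    """Count positions where both sequences hold identical arrays."""
    return sum(bool(np.array_equal(a, b)) for a, b in zip(draws_a, draws_b))

# Two single-threaded runs with the same seed, used here only to
# demonstrate the helper (1000 draws instead of SIZE to keep it quick).
np.random.seed(1)
first = [np.random.choice(100, 5) for _ in range(1000)]
np.random.seed(1)
second = [np.random.choice(100, 5) for _ in range(1000)]

matches = count_matching_draws(first, second)
print(matches)
```

A draw counts as a match only if all five values agree, so a single differing element shows up as one missing match rather than as a change in one column of n.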

Upvotes: 2

Views: 1489

Answers (1)

Maxim

Reputation: 53788

I've run your script several times on my machine and got arrays of 999995 or 999992 nearly as often as 1000000 (Python 3.5.2, NumPy 1.13.3). So the answer you're referring to is correct: np.random may produce a different result in a multi-threaded environment.

You can see it yourself if you increase the pool size, say to 1000, and the sample size, say to 50. I was able to achieve 100% inconsistency even for a smaller SIZE=100000.
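If reproducible draws across threads are the goal, one possible workaround is to stop sharing the global state and give each task its own generator. The sketch below assumes NumPy 1.17 or newer (for np.random.SeedSequence and np.random.default_rng), which is later than the version tested above; the function name draw_all and its parameters are hypothetical. The values it produces will not match those from np.random.seed(1), since it uses a different generator.

```python
import numpy as np
from multiprocessing.dummy import Pool  # thread-based pool

def draw_all(n_tasks, root_seed=1, n_threads=8):
    """Draw one sample per task, each from its own independent generator."""
    # One child seed per task, derived deterministically from root_seed.
    children = np.random.SeedSequence(root_seed).spawn(n_tasks)

    def worker(child_seed):
        # A private generator: no state is shared between threads.
        rng = np.random.default_rng(child_seed)
        return rng.choice(100, 5)

    with Pool(n_threads) as pool:
        # map returns results in task order, regardless of thread timing.
        return pool.map(worker, children)

run_a = draw_all(1000)
run_b = draw_all(1000)
identical = all(np.array_equal(a, b) for a, b in zip(run_a, run_b))
print(identical)
```

Here the result of task i depends only on the root seed and on i, not on which thread happens to run first, so repeated runs should agree regardless of pool size.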

Upvotes: 2
