Reputation: 2106
I am trying to generate a large number of random numbers quickly for an MCMC simulation.
I have the following benchmarks:
from numba import njit, prange
import numpy as np
@njit
def getRandos(n):
    for i in prange(n):
        a = np.random.rand()
%timeit np.random.rand(1000000000)
13.1 s ± 287 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit getRandos(1000000000)
1.97 s ± 25.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
Clearly the parallelization improves my runtime. However, I don't know how the seeding of the random number generation works. How can I ensure that these numbers are truly random? Do I have to randomly choose a seed somehow?
Upvotes: 0
Views: 3932
Reputation: 447
This is not an apples-to-apples comparison. The first call, np.random.rand(1000000000), spends a lot of time allocating memory and storing the random numbers, while the second call, getRandos(1000000000), just generates values and discards them.
Here is the apples-to-apples comparison (which runs at about the same speed):
from numba import prange, njit
import numpy as np
@njit
def getRandos(n):
    # Allocate the output array so the values are actually stored,
    # just as np.random.rand(n) does.
    a = np.zeros(n)
    for i in prange(n):
        a[i] = np.random.rand()
    return a
%timeit -n 100 getRandos(100000)
%timeit -n 100 np.random.rand(100000)
To answer your question, however, see the Numba documentation on random number generation.
Numba does not let you create individual RandomState instances inside jitted code, but you can set the seed inside the function definition.
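@njit
def getRandos(n):
    # Seeding inside the jitted function seeds Numba's generator,
    # so every call starts from the same state.
    np.random.seed(1111)
    a = np.zeros(n)
    for i in prange(n):
        a[i] = np.random.rand()
    return a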
values = getRandos(100000)
values2 = getRandos(100000)
print(all(values == values2)) # True
Upvotes: 1