qwertylpc

Reputation: 2106

Numba and Numpy Random Number interaction

I am trying to generate a large number of random numbers quickly for an MCMC simulation.

I have the following benchmarks:

from numba import njit, prange
import numpy as np

@njit
def getRandos(n):
    for i in prange(n):
        a = np.random.rand()


%timeit np.random.rand(1000000000)
13.1 s ± 287 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

%timeit getRandos(1000000000)
1.97 s ± 25.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Clearly the parallelization improves my runtime. However, I don't know how the seeding of the random number generation works. How can I ensure that these numbers are truly random? Do I have to randomly choose a seed somehow?

Upvotes: 0

Views: 3932

Answers (1)

K Jones

Reputation: 447

You don't have an apples-to-apples comparison. The first call, np.random.rand(1000000000), spends a lot of its time allocating an array and storing the random numbers, while the second call, getRandos(1000000000), just generates values and discards them. Also note that without parallel=True in the decorator, prange behaves like a plain range, so the loop is not actually parallelized.

Here is an apples-to-apples comparison (both run at about the same speed):

from numba import prange, njit
import numpy as np

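@njit
def getRandos(n):
    # Allocate the output so the work matches np.random.rand(n).
    a = np.zeros(n)
    # Without parallel=True in the decorator, prange acts like range.
    for i in prange(n):
        a[i] = np.random.rand()
    return a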

%timeit -n 100 getRandos(100000)
%timeit -n 100 np.random.rand(100000)

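To answer your question, however, see the Numba documentation on random number generation.

Numba doesn't let you create individual RandomState instances, but you can set the seed inside the jitted function. Numba keeps its own generator state, separate from NumPy's, so the seed has to be set from compiled code to affect it.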

@njit
def getRandos(n):
    np.random.seed(1111)  # reseeds Numba's generator on every call, so each call repeats the same sequence
    a = np.zeros(n)
    for i in prange(n):
        a[i] = np.random.rand()
    return a


values = getRandos(100000)
values2 = getRandos(100000)
print(all(values == values2))  # True

Upvotes: 1
