Alex Gao

Reputation: 2091

Is numpy.random.RandomState() automatically called whenever rand() is called?

Languages like C++ require the programmer to seed the random number generator; otherwise, its output is the same on every run. Libraries like numpy, however, do not require you to set the seed manually.

For example, code like:

from numpy.random import rand
rand()

gives a different result every time.

Does this mean that numpy.random.RandomState(seed=None) is called every time you call rand?

Upvotes: 3

Views: 2754

Answers (2)

ali_m

Reputation: 74182

The numpy.random module is like the random module from the Python standard library, in that the functions in numpy.random are bound methods of a hidden generator object that is instantiated when you import the module. This hidden numpy.random.RandomState instance currently lives in np.random.mtrand._rand (although you shouldn't rely on it always being there in future versions of numpy):

import numpy as np

print(np.random.rand)
# <built-in method rand of mtrand.RandomState object at 0x7f50ced03660>

# note the same memory address of the RandomState object:
print(np.random.mtrand._rand)
# <mtrand.RandomState object at 0x7f50ced03660>

The hidden RandomState instance will be seeded only once when you import the module (unless you then set the seed explicitly using np.random.seed()). If a new seed was chosen every time you called rand() then there would be no way to create reproducible sequences of pseudorandom numbers.
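To make the reproducibility point concrete, here is a minimal sketch (the seed value 12345 and the variable names are arbitrary choices for illustration). Re-seeding the hidden generator with the same value replays the same sequence, which would be impossible if every call to rand() picked a fresh seed:

```python
import numpy as np

# Seed the hidden global RandomState, then draw three variates.
np.random.seed(12345)
first = np.random.rand(3)

# Re-seeding with the same value restarts the same sequence.
np.random.seed(12345)
second = np.random.rand(3)

print(np.array_equal(first, second))  # True

# Without re-seeding, the generator simply continues its sequence.
third = np.random.rand(3)
```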

The situation looks something like:

# implicit RandomState created and seeded
from numpy import random

# # we could subsequently re-seed the hidden RandomState, e.g.:
# random.seed(None)

# different random variates
r1 = random.rand(1)
r2 = random.rand(1)
r3 = random.rand(1)
# ...

The automatic seeding is equivalent to np.random.RandomState(None), which uses some platform-dependent source of randomness (usually /dev/urandom on *nix) to set the seed.
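As a rough illustration, you can create explicit RandomState instances yourself and compare them. The seed 42 and the variable names below are just placeholders. Two instances built with the same integer seed produce identical streams, while instances built with None each draw their own seed from the operating system and so should, in practice, differ:

```python
import numpy as np

# Same explicit seed: identical streams.
c = np.random.RandomState(42)
d = np.random.RandomState(42)
same_fixed = np.array_equal(c.rand(5), d.rand(5))

# seed=None: each instance is seeded from a platform entropy source,
# so the streams are expected to differ (not guaranteed, but a
# collision is astronomically unlikely).
a = np.random.RandomState(None)
b = np.random.RandomState(None)
same_auto = np.array_equal(a.rand(5), b.rand(5))

print(same_fixed, same_auto)
```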

Upvotes: 2

abarnert

Reputation: 365807

Does that mean numpy.random.RandomState(seed=None) is called every time you call rand?

No, it means the RandomState is seeded once at startup. If it were re-seeded every time you call rand, then there would be no way to explicitly ask for a repeatable pattern.

The same is true for the Python stdlib's random module.
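A quick way to see this for the stdlib module, assuming CPython (where the module-level functions are bound methods of a single hidden random.Random instance; the seed 2024 is arbitrary):

```python
import random

# random.random is a bound method; __self__ is the hidden shared instance.
hidden = random.random.__self__
print(isinstance(hidden, random.Random))

# Seeding that hidden instance makes the sequence repeatable.
random.seed(2024)
first = [random.random() for _ in range(3)]
random.seed(2024)
second = [random.random() for _ in range(3)]
print(first == second)
```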

And, despite what you say about C++, it's also true for the C++ stdlib's <random> functions.

All of these document that the default seed, if you don't do anything, comes from something like the system time or a system entropy generator (like /dev/random on most *nix systems).

This is not the case for C's rand (which is still there in C++, although you should treat it as deprecated*), but only because the C standard explicitly requires that a program behave as if srand(1) had been called at startup.


If you're interested in exactly how the "once at startup" works in NumPy:

  • At the top level of the numpy.random module (which gets run the first time you import numpy.random or from numpy.random import something in your code), it constructs a global RandomState, with the default arguments (meaning seed=None).
  • RandomState's initializer just passes the seed argument on to the seed method.
  • RandomState.seed, when called with None, uses an appropriate source of system entropy for your platform (like /dev/urandom).
  • When you call the top-level rand, it uses that global RandomState.

* Not because of this problem; it's easy enough to remember to call srand at the start of your program. But a PRNG that explicitly doesn't guarantee a cycle length longer than 32767, an unbiased distribution, etc. is just a bad idea for almost anything…

Upvotes: 6
