drhagen
drhagen

Reputation: 9532

Multiple independent random number streams from single seed

I have n similar analyses each using m_i pseudo-random number streams (m_i may vary between analyses). Each analysis has its own random number seed so that the random numbers are uncorrelated between analyses.

My problem is that I need to create the m_i streams from the single seed. The analysis is currently written in Numpy, so solutions its Mersenne Twister are ideal, but I am open to solutions in other mature libraries. I considered these possibilities:

  1. Use the seed to create a random number stream, draw m_i integers, and use those integers as the seeds for m_i random streams. This is not good because of the birthday paradox. There are 2^32 (~4 billion) seeds, but if I would end up with a collision (two streams started with the same seed) after 2^16 (~60000).

  2. Multiply the seed by some constant m_max for each stream index by 1 to get that stream's seed. (e.g. with seed=2 and m_max=10000, the analysis will use seeds 20001, 20002, 20003, etc.). This is undesirable because all analyses will be limited to m_max streams before there will be collisions, and if m_max is too big, the number of analyses is limited to 2^32/m_max.

  3. Use the seed to create a random number stream, draw 624 32-bit integers per stream needed, and set the state of each stream to the 624 integers drawn for it. This seems to be perfect except that I have no idea if 624 random integers is actually a valid internal state for the Mersenne Twister (can it be arbitrary bits?). I also don't know if there is any hidden correlation between the integers (maybe they are the same stream just shifted by 624).

Is there a standard way of doing this?

Upvotes: 4

Views: 1206

Answers (1)

Peter O.
Peter O.

Reputation: 32878

Approach 3 can work, as the seed for any PRNG can be as long as that PRNG's state (for example, Mersenne Twister has a state length of 19968 bits, or 624 * 32 bits, so can accept a seed of up to that many bits — it's not limited to 32 or 64 bits, as is the practice with many APIs that implement Mersenne Twister). However you should use a PRNG of an unrelated design to Mersenne Twister, such as PCG, to seed that PRNG, then draw the 624-integer seed as you suggest. (Or, if you don't require reproducible results or if you will save the 624-integer seeds generated this way, you can use a cryptographic RNG, such as os.urandom() or secrets.SystemRandom, to draw those seeds instead.) My article on RNGs suggests several PRNGs with different designs.


UPDATE (Dec. 1, 2019):

If you're using NumPy, please note that in the meantime, NumPy 1.17 introduces a new random number generation system; it uses so-called bit generators, such as PCG, and random generators, such as the new numpy.random.Generator. It was the result of a proposal to change the RNG policy. The NumPy documentation now has detailed information on seeding RNGs in parallel, and on multithreading RNGs. I also have general information on seeding multiple processes (not NumPy-specific) in "Seed Generation for Noncryptographic PRNGs."

Upvotes: 3

Related Questions