Reputation: 11
For a PRNG like the Mersenne Twister which has a period of 2^19937-1, are there precisely that many states for the PRNG? i.e. the states start repeating at that point because there are no more states for the PRNG to be in?
As a follow up, what is the distribution over the next state given you exist in some current state of your PRNG? I am running long simulations and interested in finding out when it is likely that one simulation using a random seed will run into a state that exists in another simulation using a different random seed.
Upvotes: 1
Views: 686
Reputation: 19855
Pseudo-Random Number Generators (PRNGs) work by applying a deterministic function to the current state in order to determine the next state, then projecting that state onto the number of return bits. Asking "what is the distribution over the next state given you exit in some current state" is essentially meaningless. Since the transformation is deterministic, from any given state there is one and only one state that will be the next one. Eventually you will inevitably come back to some previously observed state, and from that point on the states and their corresponding output projections will repeat in an identical order, and we say that your PRNG has cycled. The cycle length is determined by how many unique states are reachable from your starting point (seed state), and is bounded by but not necessarily equal to the size of the state space. For instance, there are functions that will only produce even numbers if seeded by an even number, or odd numbers if seeded by an odd number. Neither of these cases would produce all possible integers before repeating.
Mersenne Twister achieves a cycle length of 219937-1 by having a state space with 19937 bits. (If I recall correctly, the -1 is because a state of all 0's is unreachable from any of the other non-zero states.) As for the likelihood of overlapping states in two runs, forget about it. To give you an idea of how big 219937-1 is, consider the following: Physicists estimate that there are about 1080 = 2266 subatomic particles in the known universe. If you've seeded your runs independently using full state initialization (for instance by reading 2.5KB from /dev/random
into a buffer of bytes), even if you're using 1010 random numbers you're asking the equivalent of how likely you are to just happen to pick the same subatomic particle twice out of 219937-300 universes. It just ain't gonna happen.
Upvotes: 6