Non-repeating PRNG algorithm

Question

The following algorithm generates an array of non-repeating random numbers (the example is written in Fortran 95):

program test
implicit none

real :: x
integer :: i, aux
integer, dimension(100) :: y = 0

do i=2,100
  call RANDOM_NUMBER(x)
  aux = int(3 * x) + 1 ! random number: 1, 2 or 3
  aux = aux + y(i-1) ! adding previous selected number
  y(i) = MOD(aux,4) ! mod 4 gives the final result: 0, 1, 2 or 3
  print*, y(i)
enddo

end program test

On another discussion forum, a member proposed this algorithm as a solution to the challenge of outputting non-repeating numbers (no value may equal the one immediately before it) using a regular random number generator and a fixed number of operations per iteration. So, for instance, redrawing whenever a random value equals the previous one would not qualify, because the number of operations per iteration would not be constant.

His algorithm seems to work well: the results are uniformly distributed and there are no obvious patterns in any sub-strings of the output (I searched for sub-strings of sizes 2 to 5 and all behaved as expected). But what puzzles me in this solution is that the random number generator is outputting only three possible numbers (0, 1 or 2) and yet the whole algorithm outputs four possible results (0, 1, 2 or 3). How is this possible? I thought that the results of a PRNG could be mapped down, but not up. For example, if a PRNG produces numbers between 0 and 7, they can be mapped as 0-3 => 0 and 4-7 => 1, but a PRNG producing only 0s and 1s cannot produce results between 0 and 7 in a single loop iteration (although one could obviously group three draws in order to map 000 => 0, 001 => 1, ... 111 => 7).

Edit: this is the same algorithm written in pseudocode, as this question is not related to Fortran or to any programming language in particular:

x ← 0
do
  aux ← random number between 1 and 3
  aux ← aux + x
  x ← aux MOD 4
  print x
enddo
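For readers who want to run the pseudocode, here is a minimal Python sketch of the same loop. It is an illustration, not the original poster's code: the function name `non_repeating`, the `count` bound (the pseudocode loops forever) and the `seed` parameter are assumptions added so the example terminates and is reproducible.

```python
import random


def non_repeating(count, modulus=4, seed=None):
    """Return `count` values in 0..modulus-1 with no two equal neighbours."""
    rng = random.Random(seed)  # seed is an assumption, only for reproducibility
    x = 0  # starting value, as in the pseudocode; it is not part of the output
    out = []
    for _ in range(count):
        step = rng.randint(1, modulus - 1)  # random number between 1 and 3
        x = (x + step) % modulus            # never equal to the previous x
        out.append(x)
    return out


print(non_repeating(20, seed=1))
```

Each iteration performs exactly one draw, one addition and one modulo, so the operation count per iteration is constant.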

gilbertohasnofb · Accepted Answer

At first sight, the algorithm above seems to take as input random integers ranging between 0 and 2 (i.e. 3 values) and output random integers ranging between 0 and 3 (i.e. 4 values) on each iteration, which looks problematic because it would amount to upsampling. But the algorithm is actually always choosing among only 3 options, given that each value cannot be the same as the previous one. For instance, if the very first integer selected is 0, there are three possible values for the next integer (1, 2 or 3), which is exactly the number of values the PRNG provides. So the key is to realize that 3 random values are being mapped onto the 3 values that differ from the previous output, not onto all 4, and this can be done without causing any unwanted patterns.
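The claim that each step chooses among exactly three options can be checked by listing every transition. The following Python sketch does this. The names `MODULUS`, `STEPS` and `successors` are illustrative and not part of the original algorithm.

```python
MODULUS = 4
STEPS = (1, 2, 3)  # the three values the PRNG can supply after the +1 offset


def successors(prev, modulus=MODULUS, steps=STEPS):
    """Values reachable from `prev` in one iteration, in step order."""
    return [(prev + s) % modulus for s in steps]


for prev in range(MODULUS):
    print(prev, "->", successors(prev))
# 0 -> [1, 2, 3]
# 1 -> [2, 3, 0]
# 2 -> [3, 0, 1]
# 3 -> [0, 1, 2]
```

Every previous value maps one-to-one onto the three values different from itself, so a uniform draw among three steps gives a uniform choice among the three allowed outputs.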

Therefore, there is no problem using MOD N+1 with a random input that takes N distinct values (0 to N-1), because each step still chooses among exactly N outcomes and the amount of information per step does not change. But when we use MOD N+2 or larger, we do observe patterns that would not be there if the output were truly random. For instance, certain pairs of consecutive numbers never appear: e.g. taking N = 3 (i.e. input between 0 and 2) and MOD 5, one will never see a 0 followed by a 4, since there is no input for which ((input + 1) + 0) MOD 5 = 4.
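The missing pairs can be enumerated directly. This Python sketch lists, for a given modulus, every pair of distinct consecutive values that the three possible steps can never produce. The function name `missing_pairs` is hypothetical and chosen only for this illustration.

```python
def missing_pairs(modulus, steps=(1, 2, 3)):
    """Pairs (prev, nxt) with prev != nxt that can never occur consecutively."""
    missing = []
    for prev in range(modulus):
        reachable = {(prev + s) % modulus for s in steps}
        for nxt in range(modulus):
            if nxt != prev and nxt not in reachable:
                missing.append((prev, nxt))
    return missing


print(missing_pairs(4))  # []
print(missing_pairs(5))  # [(0, 4), (1, 0), (2, 1), (3, 2), (4, 3)]
```

With MOD 4 nothing is missing, while with MOD 5 each value has exactly one successor it can never reach, including the 0 followed by 4 case mentioned above.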
