Reputation: 816
In answers to this other question, the following solution is provided, curtesy of OpenBSD, rewritten for brevity,
uint32_t foo( uint32_t limit ) {
uint32_t min = -limit % limit, r = 0;
for(;;) {
r = random_function();
if ( r >= min ) break;
}
return r % limit;
}
How exactly does the line uint32_t min = -limit % limit
work? What I'd like to know is, is there a mathematical proof that it does indeed calculate some lower limit for the random number and adequately removes the modulo bias?
Upvotes: 7
Views: 375
Reputation: 223702
In -limit % limit
, consider that the value produced by -limit
is 2w−limit
, where w is the width in bits of the unsigned type being used, because unsigned arithmetic is defined to wrap modulo 2w. (The assumes the type of limit
is not narrower than int
, which would result in it being promoted to int
and signed arithmetic being used, and the code could break.) Then recognize that 2w−limit
is congruent to 2w modulo limit
. So -limit % limit
produces the remainder when 2w is divided by limit
. Let this be min
.
In the set of integers {0, 1, 2, 3,… 2w−1}, a number with remainder r (0 ≤ r < limit
) when divided by limit
appears at least floor(2w/limit
) times. We can identify each of them: For 0 ≤ q < floor(2w/limit
), q•limit
+ r has remainder r and is in the set. If 0 ≤ r < min
, then there is one more such number in the set, with q = floor(2w/limit
). Those account for all the numbers in the set {0, 1, 2, 3,… 2w−1}, because floor(2w/limit
)•limit
+ min
= 2w, so our counts are complete. For r different remainders, there are floor(2w/limit
)+1 numbers with that remainder in the set, and for min
−r other remainders, there are floor(2w/limit
) with that remainder in the set.
Now suppose we randomly draw a number uniformly from this set {0, 1, 2, 3,… 2w−1}. Clearly numbers with the remainders 0 ≤ r < min
might occur slightly more often, because there are more of them in the set. By rejecting one instance of each such number, we exclude them from our distribution. Effectively, we are drawing from the set { min
, min
+1, min
+2,… 2w−1}. The result is a distribution that has exactly floor(2w/limit
) occurrences of each number with a particular remainder.
Since each remainder is represented an equal number of times in the effective distribution, each remainder has an equal chance of being selected by a uniform draw.
Upvotes: 8