MeLight
MeLight

Reputation: 5565

Why this randomization method gives skewed results?

I'm using two different randomization methods, while one gives results whose variance is what I expect, the other method gives results which are skewed and in a very consistent way too.

The methods:

function randomA() {
    var raw = Number((Math.random()+'').substr(2));
    return raw % NUM_OF_POSSIBLES;
}

function randomB() {
    var raw = Math.round(Math.random()*10000);
    return raw % NUM_OF_POSSIBLES;
}

When NUM_OF_POSSIBLES = 2 the first method (randomA()) results in a rather consistent number of zeros (64%) and 36% of 1s. While radnomB() does pretty much 50/50. If NUM_OF_POSSIBLES = 5 the first method again is skewed in a pretty consistent way: 0: 10%, 1: 23%, 2: 22%, 3: 22%, 4: 23%, while the second one gives around 20% to each result.

You can find the full code with multiple tests here: jsfiddle

Why is the first method skewed, and also why is the skewing consistent?

Upvotes: 4

Views: 95

Answers (3)

Leonid Shevtsov
Leonid Shevtsov

Reputation: 14179

You get the bias because the results of Math.random() aren't always of the same length, so for example 0.123 and 0.1235 count towards the "ones" heap.

You could think that it'd be corrected if you even out the lengths with trailing zeroes, but that also won't be correct, because 0.123 could be a rounded 0.122999999999.

The real error of the first method is relying on the least significant digit of an imprecise fraction (both %2 and %5 are only affected by the last digit), which had suffered rounding errors when converted from binary to decimal for presentation.

The original, binary form of the fraction is probably uniformly distributed, but there's no way of reading it in Javascript.

Now, if someone would explain the distribution of trailing digits of a rounded decimal fraction...

Upvotes: 1

Vlad Nikitin
Vlad Nikitin

Reputation: 1951

I have found that the reason of why randomA is working in such way is because java script uses floating poing numbers with 52+1 Digits (table under Basic formats chapter). So after random function returns too big value it is rounded, for example

Math.pow(2, 54) +1 
//18014398509481984
Math.pow(2, 54)  
//18014398509481984
Math.pow(2, 54)  -1
//18014398509481984

all return the same value wich is divided by 2 (because of rounding).

for more understanding you can play and see how it looks in biniry format , examples :

parseInt(Math.pow(2, 54) - 2).toString(2)
//"111111111111111111111111111111111111111111111111111110"
parseInt(Math.pow(2, 54) - 3).toString(2)
//"111111111111111111111111111111111111111111111111111100"
parseInt(Math.pow(2, 54) ).toString(2)
//"1000000000000000000000000000000000000000000000000000000"
parseInt(Math.pow(2, 54) -1).toString(2)
//"1000000000000000000000000000000000000000000000000000000"

Upvotes: 1

Ted Hopp
Ted Hopp

Reputation: 234797

I'm not entirely sure, but my guess is that it has to do with the rounding mode used when JavaScript formats a number as a string. In the first case, your result depends on the choice of last digit, which is sensitive to this rounding. If it's biased toward even numbers, that would explain your results. (In the case of NUM_OF_POSSIBLES == 5, it would be because of a deficit of 5s as the last digit.) In the second routine, the result depends on an intermediate digit of the string representation, which is pretty much isolated from that influence.

You might have better results by chopping off the last digit or two in the first routine.

EDIT I just confirmed experimentally that if the first routine is changed to chop off the last digit:

function randomA() {
    var raw = String(Math.random());
    raw = raw.substring(2, raw.length-1);
    return raw % NUM_OF_POSSIBLES;
}

then the bias appears to be gone when NUM_OF_POSSIBLES == 2 or 5.

Upvotes: 3

Related Questions