Furkan Gözükara
Furkan Gözükara

Reputation: 23860

True random is not true random I am really confused

Alright I tested the way below.

Generated x times random numbers between 0~x and then checked the ones that were not generated.

I would assume that it would be very close to 100%. What I mean is all numbers between 0~x are generated.

But results are shocking. About 36% of the numbers are missing.

Is my random function not really random?

Here below my random class:

private static Random seedGenerator = new Random();

private static ThreadLocal<Random> random = 
    new ThreadLocal<Random>(SeededRandomFactory);

private static Random SeededRandomFactory()
{
    lock (seedGenerator)
        return new Random(seedGenerator.Next());
}

public static int GenerateRandomValueMin(int irRandValRange, int irMinValue)
{
    return random.Value.Next(irMinValue, irMinValue + irRandValRange);
}

Here the below results:

Between 0-10, missing numbers count: 4, percent: 40%
Between 0-100, missing numbers count: 36, percent: 36%
Between 0-1000, missing numbers count: 369, percent: 36,9%
Between 0-10000, missing numbers count: 3674, percent: 36,74%
Between 0-100000, missing numbers count: 36583, percent: 36,58%
Between 0-1000000, missing numbers count: 367900, percent: 36,79%
Between 0-10000000, missing numbers count: 3678122, percent: 36,78%
Between 0-100000000, missing numbers count: 36797477, percent: 36,8%

Here the code how I check:

File.WriteAllText("results.txt", "");

int irFirst = 10;

for (int i = 0; i < 8; i++)
{
    HashSet<int> hsGenerated = new HashSet<int>();

    for (int k = 0; k < irFirst; k++)
    {
        hsGenerated.Add(GenerateRandomValue.GenerateRandomValueMin(irFirst, 0));
    }

    int irNotFound = 0;
    for (int k = 0; k < irFirst; k++)
    {
        if (hsGenerated.Contains(k) == false)
            irNotFound++;
    }

    string srSonuc = 
        string.Format(
            "Between 0-{0}, missing numbers count: {1}, percent: {2}%", 
            irFirst, irNotFound,
            Math.Round((Convert.ToDouble(irNotFound)/Convert.ToDouble(irFirst))*100.0, 2).ToString()
            );

    using (StreamWriter w = File.AppendText("sonuclar.txt"))
    {
        w.WriteLine(srSonuc);
    }

    irFirst = irFirst * 10;
}

Upvotes: 1

Views: 270

Answers (3)

DasKr&#252;melmonster
DasKr&#252;melmonster

Reputation: 6060

As mentioned in the comments, your testing method is off.

You draw x times a number between 0 and x. The probability that a specific number is not drawn is:

((x-1)/x)^x

As x approaches infinity, p will go towards 1/e (or approx. 36.7879441%) And this is the number you are seeing in your results. Also, as x approaches infinity, you will observe this probability as outcome of your sample (Law of large numbers)

This has to do with probability. When you have a bowl with a red and a white marble. And you take one, put it back an take another one you cannot guarantee that you see both. You could take the red one twice. You are doing the same thing with more objects.


To elaborate on true true randomness:

I would expect close to 99% percent instead of 64%. Or at least 90%+ percent. So you say this isn't possible with current technology

That is simple. Thanks to modern math, technology and my super powers I can tell you how to do that: You need more draws than numbers to choose from. The formula becomes:

more math

where n is you desired percentage of missing numbers. For example if you are willing to accept 5% numbers missing, you must draw three times as many random numbers. For a 1% chance, you need to iterate 4.6 times the maximum number.

This math assumes a perfectly uniform random number generation.

Upvotes: 13

Casperah
Casperah

Reputation: 4564

If you want 100 million unique random number you could do something like this:

Now using Fisher-Yates suffle algorithm:

List<int> numbers = new List<int>(100000000);
for (int i = 0; i < numbers.Capacity; i++)
{
    int rnd = random.Next(numbers.Count + 1);
    if (rnd == numbers.Count)
        numbers.Add(i);
    else
    {
        numbers.Add(numbers[rnd]);
        numbers[rnd] = i;
    }
}

By the way you could calculate irNotFound much faster:

int irNotFound = irFirst - hsGenerated.Count;

Good luck with your quest.

Upvotes: -1

David Heffernan
David Heffernan

Reputation: 613481

Your results are exactly what is to be expected from a uniform distribution where you sample with replacement.

Consider the simplest possible example. You have a coin and toss it twice. So we assume that we are sampling from a uniform discrete distribution.

The possible outcomes, which occur with equal probability of 0.25 are:

TT
TH
HT
HH

As you can see, only two of the four outcomes have both heads and tails.

This is known as sampling with replacement. So, once we have sampled a tails, then we "put it back in the bag", and it could come out again on the next sample.

Now suppose we sample without replacement. In that case there are two possible outcomes:

TH
HT

And as you see, each possible value appears exactly once.

Essentially your expectation for the results is not correct. As another example, suppose you toss a coin and it comes down tails. What do you expect will happen on the next toss. You are arguing that the coin must now come down heads. But that is clearly nonsense.


If you did want to sample without replacement, and it's not clear that's really what you want, then you do so with the Fisher-Yates shuffle.

Upvotes: 5

Related Questions