Koy
Koy

Reputation: 628

Exactly one high card in a 6 card hand using simulations

So I've got a probability problem that (out of pure boredom) I decided to try and solve using simulations. Problem: What is the probability of drawing exactly one high card in a 6 card hand.

Now, this problem is specifically about some game that's played in my country so for some weird reason there are 21 high cards, but that's not important.

Solving this problem by hand, using basic combinatorics I got: enter image description here

And now, here's the way I simulated it in C:

The main function:

int main(void)
{

    srand(time(0));
    int deck[52];

    int i;
    for(i = 0; i<21; i++) deck[i] = 1;
    for(i = 21; i<52; i++) deck[i] = 0;

    int n;
    printf("# of simulations: ");
    scanf("%d", &n);

    int memo[52];

    int hits = 0;
    for(i = 0; i<n; i++){
      clear_memo(memo);
      hits += simulate(deck, memo);
    }
    printf("Result: %lf\n", (double)hits/n);


}

So the deck is an array of 52 numbers where the first 21 have the value 1 (high cards) and the other 31 elements have the value 0 (low cards).

The memo will be sent to the simulation function each time to keep track of which cards have already been drawn. The memo also gets reset every time using the clear_memo function which does nothing but set all the values to zero.

Then it calls the simulation functions and counts the hits.

Here's the simulation function:

int simulate(int * deck, int * memo){

  //I draw the first card separetly in order to initialize the had_high variable
  int index = ( rand() % 52 );
  int card = deck[index];
  int had_high = (card == 1);
  memo[index] = 1;

  //printf("%d ", index);

  int i = 1;
  while(i < 6){

    int draw = (rand() % 52);
    //printf("%d ", draw);
    if(memo[draw]) continue;

    index = draw;
    card = deck[index];
    memo[index] = 1;

    if(card){
        if(had_high) { //meaning there are 2 high cards, no hit
          //printf("\n");
          return 0;
        }
        had_high = 1; //if not, then this is the first high card
    }

    i++;
  }
  printf("\n");
  return had_high; //the function would've reached this point even if all the cards had been low
                   //therefore I return had_high instead of just 1

}

The simulation function itself works, I've tested it separately a lot of times and there seem to be no problems with it.

However, when I run the program with a high number of simulations (100k or 1m) the result is always approx. 0.175 which is not what I got with my by hand calculation.

I am reasonably certain that my by hand calculation is correct (but correct me if I'm wrong there as well).

If I'm right about the by hand calculations then there must be something wrong with how I simulated this event. One of my thoughts was that it had something to do with the rand function and how it's pseudo-random, but I really don't know as it is very hard to test anything that works with random numbers.

Any ideas?

Thanks.

EDIT: As per request of klutt:

void clear_memo(int * memo){
  int i = 0;
  for(;i<52;i++) memo[i] = 0;
}

Upvotes: 1

Views: 89

Answers (2)

klutt
klutt

Reputation: 31409

Your calculation is wrong

What you are calculating is the probability that the first card is high and all of the rest are low. Or at least it seams that you're trying to do so. You're slightly off. It should be (21/52)*(31/51)*(30/50)*(29/49)*(28/48)*(27/47) = 0.02921...

You should multiply this by 6, since the high card can appear anywhere. Then you have the probability for exactly one high, which is 0.17526

rand() % n has a non-uniform distribution

That being said, be aware that the random generator in C is not very good to use in this way. Depending on how you use it it can get a horrible distribution. If you are using C++, you can use:

std::random_device rd;
std::default_random_engine generator(rd());
std::uniform_int_distribution<int> distribution(0,51);
int index = distribution(generator);

In simulations like this, this may have a huge effect. In this case it had a small effect. I tested both the standard rand() method and the C++ variant and ran the simulation 4 times with 10 million iterations each time:

Using rand() % 52:

Result: 0.175141
Result: 0.175074
Result: 0.175318
Result: 0.175506

Using distribution(generator):

Result: 0.175197
Result: 0.175225
Result: 0.175228
Result: 0.175293

As you can see, the deviation is smaller. So if accuracy matters, either consider switching to C++ and use those methods, or find a way to get a good distribution. And if you are doing a simulation to numerically calculate a probability, then it does matter.

Upvotes: 0

Weather Vane
Weather Vane

Reputation: 34585

My program gives the same result as yours - about 0.175

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main(void)
{
    int deck[52];
    int successes = 0;

    srand((unsigned)time(NULL));
    for(int run = 0; run < 100000; run++) {
        for(int n = 0; n<52; n++) {
            deck[n] = n;
        }
        int cards = 52;
        int highs = 0;
        for(int n=0; n<6; n++) {
            int index = rand() % cards;
            if(deck[index ] < 21) {
                highs++;
            }
            deck[index] = deck[--cards];
        }
        if(highs == 1) {
            successes++;
        }
    }
    printf("Probability of drawing exactly one high card = %f\n", successes / 100000.0);
}

But the combinatrics are wrong in two ways:

  • There are only 31 "low" cards in the pack so the expression should be
21   31   30   29   28   27
__ . __ . __ . __ . __ . __   = 0.02921

52   51   50   49   48   47
  • Secondly, any of the 6 draws can be a "high", not just the first, so multiply the chance by 6.
0.02921 * 6 = 0.1752

Upvotes: 1

Related Questions