Reputation:
There are extensive resources on the internet discussing the famous Birthday Paradox. It is clear to me how you calculate the probability of two people sharing a birthday i.e. P(same) = 1 - P(different)
. However if I ask myself something apparently more simple I stall: firstly, let's say I generate two random birthdays. Getting the same birthday is like tossing a coin. Either the two persons share a birthday (Heads) or they don't share a birthday (Tail). Run this 500 times and the end result (#Heads/500) will somehow be close to 0.5
Q1) But how do I think about this if I generate three random birthdays? How can I estimate the probability then? Obviously my coin analogy won't be applicable.
Q2) once I have figured out the above I will need to scale it up and generate 30 or 50 birthdays. Is there a recommended technique or algorithm to isolate identical birthdays from a large set? Should I put them into arrays and loop through them?
Here's what I think I need:
r = 25 i.e. each trial run generates 25 birthdays
Trial 1 >
3 duplicates: 0
Trial 2 >
3 duplicates: 0
Trial 3 >
3 duplicates: 2
Trial 4 >
3 duplicates: 1
...
T100 >
3 duplicates: 2
estimated probability of 3 persons sharing a birthday in a room of 25 = (0+0+2+1+...+2)/100
Upvotes: 2
Views: 3016
Reputation: 1207
Create an integer array of length 365, initialized to 0. Then generate N (in your case 25) random numbers between 1-365 and increase that number in the array (ie. bdays[random_value]++). Since you are only interested in a collision happening, right after increasing the number in the array check if it is greater than 2 (If it is then there is a second collision, which means there are 3 people with the same birthday). Keep track of collisions and execute this as many times as you wish (1000).
In the end, the ratio of collisions/1000 will be your requested value.
and, no tossing coins analogy is wrong.
Upvotes: 2
Reputation: 507
Sounds like your first task will be to create a method that will generate random birthdays. To keep things simple, you can use the numbers 1-365 to denote unique birthdays.
Store however many random birthdays (2 in the first case more later) in an ArrayList as Strings. You will want to use a loop to call the random number function and store the value in your list.
Then make a function to search the ArrayList for duplicates. If there are any duplicates (no matter how many) then that's a Heads result. If there are no matches then it's a Tails.
Your probabilities will be far different from 50/50 until you get to 20 or so.
Upvotes: 0
Reputation: 28682
Check this similar question and its answers on CrossValidated, but I think it is really worth thinking about the classic Birthday problem again to get the basics.
To the second part of your question: depends on the language you use. I definitely suggest using R to solve a problem like that, as checking identical birthdays in a list/vector/data frame can easily done with a simple unique
call. To run a such simple MC simulation R is again really handy, check the second answer on the link above.
Upvotes: 0