Calculating lottery winnings

Question

I'm running a basic lottery simulation. Six numbers are drawn at random from 1 to 50, twice: (i) the lottery result and (ii) the ticket bought. Each ticket costs $2. A person plays the lottery every day for 25 years (365*25 draws). The ticket is compared to the lottery result; the order of the numbers does not matter. I repeat this 25-year process 50 times, and do 3 independent repeats of the whole thing. In other words, 50 people play the lottery every day for 25 years, and I want to collect that data 3 times.

gross_won<-matrix(NA,nrow=3,ncol=50); 
mean_prize<-matrix(NA,nrow=3,ncol=50); 
net_won<-matrix(NA,nrow=3,ncol=50)
for (k in 1:3) { 
  m<-vector()
  for (x in 1:50) { 
    for (i in 1:(365*25)) {
      res<-sample(1:50,6,replace=FALSE) 
      ticket<-sample(1:50,6,replace=FALSE) 
      m[i]<-length(intersect(res,ticket)) 
    }
    winnings<-c(0,0,0,50,200,150000,2000000,m)[match(m, c(0,1,2,3,4,5,6,m))] #Convert no. matches to money won
    gross_won[k,x]<-sum(winnings)
    mean_prize[k,x]<-mean(winnings)
    net_won[k,x]<-gross_won[k,x]-(2*(365*25)) #Adjust for $2 ticket cost
  }
}
rowMeans(gross_won)
rowMeans(net_won)
rowMeans(mean_prize)

The amount of money won for 0,1,2,3,4,5,6 matches is: 0,0,0,50,200,150000,2000000.

When I run this, individual 25-year runs tend to yield a net result of approximately -$9000. However, when I run it with more iterations, I tend to gain money on average. This may or may not be correct. I know that with enough iterations the low-probability events (winning 150k or 2m) will show up, but I wanted to ask whether I have made any obvious errors in the code that could be causing unexpected results.

Allan Cameron · Accepted Answer

Your code seems to be a lot more complex than it needs to be. This function simulates the amount of money won or lost with a single lottery ticket:

lottery <- function()
{
  c(0, 0, 0, 50, 200, 15e4, 2e6)[sum(sample(50, 6) %in% sample(50, 6)) + 1] - 2
}
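
The one-liner packs several steps together. Unpacked, and using nothing beyond base R, it might read as follows (lottery_verbose, ticket_cost and the intermediate names are hypothetical, introduced only for illustration):

```r
# Hypothetical step-by-step equivalent of a single-ticket simulation.
lottery_verbose <- function(ticket_cost = 2) {
  prizes  <- c(0, 0, 0, 50, 200, 15e4, 2e6)  # payout for 0, 1, ..., 6 matches
  result  <- sample(50, 6)                   # the six numbers drawn
  ticket  <- sample(50, 6)                   # the six numbers on the ticket
  matches <- sum(ticket %in% result)         # order does not matter
  prizes[matches + 1] - ticket_cost          # + 1 because R indexes from 1
}

lottery_verbose()
```

Every call returns one of the seven possible net outcomes, from -2 (no prize) up to 1999998 (all six numbers matched).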

And this function will sum the outcome of as many lotteries as you like:

multi_lottery <- function(times) sum(replicate(times, lottery()))

So we can do 25 years' worth like this:

set.seed(69)
multi_lottery(25 * 365)
#> [1] -8950

Which gives us a fairly typical outcome.
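
For very large runs, calling sample() twice per ticket can get slow. One alternative, which is a different technique from the one used here, is to note that the number of matches follows a hypergeometric distribution and to draw it directly with rhyper(). A minimal sketch, where multi_lottery_fast and ticket_cost are hypothetical names:

```r
# Hypothetical vectorised variant: draw the number of matches for every
# ticket at once instead of sampling two sets of numbers per ticket.
multi_lottery_fast <- function(times, ticket_cost = 2) {
  prizes <- c(0, 0, 0, 50, 200, 15e4, 2e6)
  # rhyper(nn, m, n, k): m = 6 winning numbers, n = 44 other numbers,
  # k = 6 numbers on the ticket
  matches <- rhyper(times, m = 6, n = 44, k = 6)
  sum(prizes[matches + 1]) - ticket_cost * times
}

multi_lottery_fast(25 * 365)
```

Because it consumes random numbers differently, this will not reproduce the same value for a given seed, but the distribution of outcomes should be the same.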

However, suppose we had 100 people who did the lottery daily for 25 years:

many_people <- replicate(100, multi_lottery(25 * 365))
many_people
#>   [1]  -6400  -9900 -10100 142050 -10400 142600  -9300  -9600  -8100
#>  [10]  -9000 141500  -9300  -9100  -9500  -7400  -9000  -9950  -9350
#>  [19]  -9000  -9800  -7700  -8900  -7650  -7800 141200 -10500  -9700
#>  [28]  -9000  -8650  -8750 141550  -9500 139550  -7650  -9350  -8800
#>  [37]  -9750  -9150  -8600  -8550  -8150  -9650 142350  -7850  -9000
#>  [46]  -9400 139700  -8850 139750 -10250  -8500 -10250  -9300  -9600
#>  [55]  -9750  -7900  -8600  -9550  -9700  -9650  -9450  -8600  -9800
#>  [64]  -8800 -10050  -9150  -8450  -9050  -9250  -8900  -9000  -9500
#>  [73]  -9200  -9100  -8650  -9400  -8600  -9600  -7800  -6650  -8750
#>  [82]  -9800 -10100 -10850 140200  -9000  -8450  -9700  -9100  -9450
#>  [91]  -8100  -8550  -9050  -8100  -8450  -8250  -8850  -7850 -10100
#> [100]  -9250

Most people are losers - the median change in wealth is indeed -9000, so most of the time you run the simulation you will get around that value.

median(many_people)
#> [1] -9000

But the mean is increased substantially by the occasional big win:

mean(many_people)
#> [1] 5985.5
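
The gap between median and mean comes down to how rare the big prizes are. As a rough check, the chance that one person matches five balls at least once in 25 years of daily tickets can be worked out directly (n_tickets, p_five and p_any_five are illustrative names):

```r
n_tickets <- 25 * 365

# Probability of exactly 5 matches on a single ticket
p_five <- choose(6, 5) * choose(44, 1) / choose(50, 6)

# Probability of at least one 5-match win across all tickets
p_any_five <- 1 - (1 - p_five)^n_tickets
p_any_five
```

That comes out at roughly 14%, broadly in line with the 10 positive totals out of 100 in the output above. Most players never see a large prize, which is why the median sits near -9000 while the few winners pull the mean up.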

So after 100 people have played every day for 25 years, the people running the lottery have lost a net $598,550:

sum(many_people)
#> [1] 598550

This is of course spread over 100 * 25 * 365 tickets, so on average the lottery loses

sum(many_people)/(100 * 25 * 365)
#> [1] 0.6559452

Around 66 cents per ticket, at least in this simulation.

Incidentally, even this many tickets isn't quite enough to capture the very occasional big win. To do that, we can work out the mathematical expectation: the average net change per ticket if you ran the lottery an infinite number of times. First we work out the probability of matching 0 to 6 balls in a single draw:

p_0 <- (choose(6, 0) * choose(44, 6))/choose(50, 6)
p_1 <- (choose(6, 1) * choose(44, 5))/choose(50, 6)
p_2 <- (choose(6, 2) * choose(44, 4))/choose(50, 6)
p_3 <- (choose(6, 3) * choose(44, 3))/choose(50, 6)
p_4 <- (choose(6, 4) * choose(44, 2))/choose(50, 6)
p_5 <- (choose(6, 5) * choose(44, 1))/choose(50, 6)
p_6 <- (choose(6, 6) * choose(44, 0))/choose(50, 6)

If we've got our maths right here, the probabilities should sum to 1:

p_0 + p_1 + p_2 + p_3 + p_4 + p_5 + p_6
#> [1] 1
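
These are hypergeometric probabilities, so base R's dhyper() should give the same numbers in one call. A quick cross-check, with p_manual and p_hyper as illustrative names:

```r
# The seven probabilities written out with choose(), for 0 to 6 matches
p_manual <- choose(6, 0:6) * choose(44, 6 - 0:6) / choose(50, 6)

# dhyper(x, m, n, k): x matches, m = 6 winning numbers,
# n = 44 other numbers, k = 6 numbers on the ticket
p_hyper <- dhyper(0:6, m = 6, n = 44, k = 6)

all.equal(p_manual, p_hyper)
```

If the two approaches agree, all.equal() returns TRUE.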

Now we multiply the payouts (minus the fixed ticket price) by the probabilities. The sum gives us our answer:

outcomes <- c(0, 0, 0, 50, 200, 15e4, 2e6) - 2
sum(outcomes * c(p_0, p_1, p_2, p_3, p_4, p_5, p_6))
#> [1] 1.629922

So actually, in the very long run, the lottery will lose about $1.63 per ticket sold.
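
To see where the expected value comes from, the expected payout can be split by prize tier. A short sketch using the same payouts and probabilities (k, p, prizes and contribution are illustrative names):

```r
k      <- 0:6
p      <- choose(6, k) * choose(44, 6 - k) / choose(50, 6)
prizes <- c(0, 0, 0, 50, 200, 15e4, 2e6)

# Expected payout per ticket contributed by each number of matches
contribution <- setNames(prizes * p, paste0("match_", k))
round(contribution, 3)

# Expected net result per ticket after the $2 cost
sum(contribution) - 2
```

Most of the expected payout, about $2.49 of roughly $3.63, comes from the five-match prize, which any one player wins only rarely. A simulation that happens to see fewer of those wins than expected will therefore undershoot the theoretical value.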
