Serdar

Reputation: 35

How to build a Bayesian simulation model for flipping coin three times

Imagine we tossed a biased coin 8 times (we don't know how biased it is) and recorded 5 heads (H) and 3 tails (T) so far. What is the probability that the next 3 tosses will all be tails? In other words, we want the expected probability of ending up with 5 Hs and 6 Ts after the 11th toss.

I want to build an MCMC simulation model using PyMC3 to find the Bayesian solution. There is also an analytical solution within the Bayesian approach for this problem, so I will be able to compare the results from the simulation, the analytical approach, and the classical frequentist approach. Let me briefly explain what I have done so far:

1. Frequentist solution:

If we consider the probability of tails for a single toss: E(T) = p = 3/8 = 0.375. Then the ultimate answer is E({T,T,T}) = p^3 = (3/8)^3 ≈ 0.0527.
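The frequentist arithmetic above can be checked in a couple of lines:

```python
# Frequentist point estimate: plug the observed tail frequency (the MLE of p)
# into the probability of three tails in a row
p_hat = 3 / 8          # 5 heads, 3 tails observed
prob_ttt = p_hat ** 3  # probability that the next three tosses are all tails
print(p_hat, prob_ttt)  # 0.375 0.052734375
```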

2.1. Bayesian solution (analytical):

Assume the unknown parameter p represents the bias of the coin. If we consider the probability of tails for a single toss:

E(T) = ∫₀¹ p · P(p | H = 5, T = 3) dp = 0.400

(I calculated the result after some algebraic manipulation.) Similarly, the ultimate answer is:

E({T,T,T}) = ∫₀¹ p³ · P(p | H = 5, T = 3) dp = 1/11 ≈ 0.0909
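Both integrals can be checked numerically. With a uniform prior and 3 tails out of 8 tosses, the posterior P(p | H = 5, T = 3) is Beta(4, 6); a quick sketch using SciPy's quadrature against that density:

```python
from scipy.integrate import quad
from scipy.stats import beta

# Posterior for p under a uniform prior, having observed 3 tails and 5 heads
posterior = beta(4, 6)

# E(T) = integral of p * posterior(p) dp over [0, 1]
e_t, _ = quad(lambda p: p * posterior.pdf(p), 0, 1)

# E({T,T,T}) = integral of p^3 * posterior(p) dp over [0, 1]
e_ttt, _ = quad(lambda p: p**3 * posterior.pdf(p), 0, 1)

print(e_t)    # ≈ 0.400
print(e_ttt)  # ≈ 0.0909 (= 1/11)
```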

2.2. Bayesian solution via MCMC simulation: To estimate the probability for a single toss, I built the model in PyMC3 as below:

import pymc3 as pm

# Head: 0, Tail: 1
data = [0, 0, 0, 0, 0, 1, 1, 1]

with pm.Model() as coin_flipping:
    p = pm.Uniform('p', lower=0, upper=1)
    y = pm.Bernoulli('y', p=p, observed=data)
    trace = pm.sample(1000)
    pm.traceplot(trace)

After running this code, I got a posterior mean of E(T) = 0.398, which is very close to the analytical result (0.400). I am happy so far, but this is not the ultimate answer. I need a model that simulates the probability E({T,T,T}). I would appreciate it if someone could help me with this step.

Upvotes: 2

Views: 1008

Answers (1)

merv
merv

Reputation: 76700

One way to do this empirically is with PyMC3's posterior predictive sampling. That is, once you have a posterior sample, you can generate new observations from random parameterizations of the model. The pymc3.sample_posterior_predictive() method generates new samples the size of your original observed data. Since you are only interested in three flips, we can just ignore the additional flips it generates. For example, if you wanted 10000 random sets of predicted flips, you would do

with pm.Model() as coin_flipping:
    # this is still uniform, but I always prefer Beta for proportions
    p = pm.Beta('p', alpha=1, beta=1)

    pm.Bernoulli('y', p=p, observed=data)

    # chains looked a bit waggly at 1K; 10K looks smoother
    trace = pm.sample(10000, random_seed=2019, chains=4)

    # note this generates (10000, 8) observations
    post_pred = pm.sample_posterior_predictive(trace, samples=10000, random_seed=2019)

To then see how often the next three flips come up (1, 1, 1), we can do

import numpy as np

np.mean(np.sum(post_pred['y'][:, :3], axis=1) == 3)
# 0.0919
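An equivalent shortcut is to skip the posterior predictive step and average p³ directly over posterior draws of p, since P(three tails) = E[p³] under the posterior. A minimal sketch, drawing from the exact conjugate posterior Beta(4, 6) in place of the MCMC trace (with the real trace you would use np.mean(trace['p']**3)):

```python
import numpy as np

rng = np.random.default_rng(2019)

# Stand-in for trace['p']: direct draws from the exact posterior Beta(4, 6)
p_samples = rng.beta(4, 6, size=100_000)

# P(next three flips are all tails) = E[p^3] under the posterior
est = np.mean(p_samples ** 3)
print(est)  # ≈ 0.0909
```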

Analytic Solution

In this example, since we have an analytic posterior predictive distribution (Beta-Binomial[k | n, a=4, b=6] - see the Wikipedia table of conjugate distributions for details), we can exactly calculate the probability of observing three tails in the next three flips as follows:

from scipy.special import comb, beta as beta_fn

n, k = 3, 3  # flips, tails
a, b = 4, 6  # 1 + observed tails, 1 + observed heads

comb(n, k) * beta_fn(k + a, n - k + b) / beta_fn(a, b)
# 0.09090909090909091

Note that the beta_fn is the Euler Beta function, as distinct from the Beta distribution.
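As a cross-check, SciPy (1.4 and later) also ships the Beta-Binomial distribution directly as scipy.stats.betabinom, which gives the same number without spelling out the formula:

```python
from scipy.stats import betabinom

# P(k=3 tails in n=3 future flips) under Beta-Binomial(n, a=4, b=6)
print(betabinom.pmf(3, 3, 4, 6))  # 0.0909090909...
```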

Upvotes: 1
