Salma Bouzid
Salma Bouzid

Reputation: 65

Compute the posterior probability given a Bernoulli distributed likelihood

In a coin flip, we would like to compute p(theta|Data), where theta is the underlying parameter.

Here is the code implementation:

    a = 1  # a and b are the beta distribution's parameters
    b= 1
    num = 1e5 #Number of candidate theta values
    z= 17220 #Number of heads
    N= 143293 #Total number of flips

    Theta = seq(0.07,0.12, length.out= num)
    prior = dbeta(Theta, a,b) #Compute the prior at each value 

    likelihood = Theta^z *(1-Theta)^(N-z)

    pData = likelihood * prior /sum(likelihood * prior) #Compute evidence
    posterior = likelihood*prior / pData

I would like to verify that the posterior is equal to the analytical solutions beta(a+z, N-z+b). However, since the likelihood equals 0 because the theta values are small, the probability of the evidence is a Nan and so is the posterior.

I have tried computing the log likelihood but it gives me a large negative number which is equal to 0 when taking the exponential.

 Theta = seq(0.07,0.12, by= num_steps)
 lprior = log(dbeta(Theta, a,b)) #Compute the log prior at each value 

 llikelihood = log(Theta)*z + log(1-Theta)*(N-z) #log likelihood

 lpData = llikelihood + lprior - sum(llikelihood + lprior) #compute evidence

  lposterior = llikelihood+lprior - lpData
  posterior = exp(lposterior)
  plot(Theta, posterior, type="l")
  lines(Theta, exp(llikelihood), type="l")
  lines(Theta, exp(lprior), type="l")

If my ultimate goal is to have a nice graph that shows the posterior, likelihood and prior like so

this

How should I be computing each value?

Upvotes: 1

Views: 790

Answers (1)

Salma Bouzid
Salma Bouzid

Reputation: 65

This answer has been provided through the comment section by @JosephClarkMcIntyre.

Here is a summary:

  • In a Bernoulli trial, when N -the total number of trials- and z -the total number of success are large and the underlying parameter theta is small, it is better to only operate in the log space and never take the exponential.
  • Moreover, since the log function is increasing, comparing the log posteriors of two distributions is similar to comparing the posterior.
  • The above implementation was wrong because the formula for computing the evidence is not correct. p(evidence) = sum(likelihood*prior), p(log_evidence)= sum(log_likelihood +log_prior)

This is the final code, where the prior, likelihood and evidence are in the log space:

  a = 1  # a and b are the beta distribution's paramteres
  b= 1
  num_steps = 1e5
  z= 17220 #Number of heads
  N= 143293 #Total number of flips

  Theta = seq(from=0.07,to=0.12, length.out= num_steps)
  lprior = dbeta(Theta, a,b,log=TRUE) #Compute the log prior at each value 

  llikelihood = log(Theta)*z + log1p(-Theta)*(N-z) #log likelihood

  lpData = sum(llikelihood + lprior) #compute log of the evidence

  lposterior = llikelihood+lprior - lpData
  plot(Theta,log(dbeta(Theta,a+z,N-z+b)))
  plot(Theta, lposterior, type="l")

However, the analytical and the computed log posterior are not the same as shown in the graph..

Feel free to comment if you think there is a flaw in this answer or explain why the analytical and computed log posterior are not the same. ^^

Upvotes: 0

Related Questions