Reputation: 65
In a coin flip, we would like to compute p(theta|Data), where theta is the underlying parameter.
Here is the code implementation:
a = 1 # a and b are the beta distribution's parameters
b= 1
num = 1e5 #Number of candidate theta values
z= 17220 #Number of heads
N= 143293 #Total number of flips
Theta = seq(0.07,0.12, length.out= num)
prior = dbeta(Theta, a,b) #Compute the prior at each value
likelihood = Theta^z *(1-Theta)^(N-z)
pData = likelihood * prior /sum(likelihood * prior) #Compute evidence
posterior = likelihood*prior / pData
I would like to verify that the posterior is equal to the analytical solutions beta(a+z, N-z+b). However, since the likelihood equals 0 because the theta values are small, the probability of the evidence is a Nan and so is the posterior.
I have tried computing the log likelihood but it gives me a large negative number which is equal to 0 when taking the exponential.
Theta = seq(0.07,0.12, by= num_steps)
lprior = log(dbeta(Theta, a,b)) #Compute the log prior at each value
llikelihood = log(Theta)*z + log(1-Theta)*(N-z) #log likelihood
lpData = llikelihood + lprior - sum(llikelihood + lprior) #compute evidence
lposterior = llikelihood+lprior - lpData
posterior = exp(lposterior)
plot(Theta, posterior, type="l")
lines(Theta, exp(llikelihood), type="l")
lines(Theta, exp(lprior), type="l")
If my ultimate goal is to have a nice graph that shows the posterior, likelihood and prior like so
How should I be computing each value?
Upvotes: 1
Views: 790
Reputation: 65
This answer has been provided through the comment section by @JosephClarkMcIntyre.
Here is a summary:
This is the final code, where the prior, likelihood and evidence are in the log space:
a = 1 # a and b are the beta distribution's paramteres
b= 1
num_steps = 1e5
z= 17220 #Number of heads
N= 143293 #Total number of flips
Theta = seq(from=0.07,to=0.12, length.out= num_steps)
lprior = dbeta(Theta, a,b,log=TRUE) #Compute the log prior at each value
llikelihood = log(Theta)*z + log1p(-Theta)*(N-z) #log likelihood
lpData = sum(llikelihood + lprior) #compute log of the evidence
lposterior = llikelihood+lprior - lpData
plot(Theta,log(dbeta(Theta,a+z,N-z+b)))
plot(Theta, lposterior, type="l")
However, the analytical and the computed log posterior are not the same as shown in the graph..
Feel free to comment if you think there is a flaw in this answer or explain why the analytical and computed log posterior are not the same. ^^
Upvotes: 0