Reputation: 133
We have some data and a probabilistic model with latent variables, and we want to estimate the posterior distribution after seeing the data. Usually this posterior $p(x|z)$ is hard to compute, so we use variational inference or MCMC.
What I don't understand is why MCMC plays such an essential role. MCMC can only draw samples, but we may want to fit model parameters rather than just draw samples. For example, for $p(x,\theta|z)$ we may want to fit the parameter $\theta$; merely drawing samples does not satisfy that need.
My question is: since MCMC can only draw samples from the posterior, why is it so important?
Upvotes: 2
Views: 817
Reputation: 361
Your central assumption is incorrect: we use MCMC because $p(z)$ (i.e. the normalizing constant) is often hard to compute, not because $p(x|z)$ is hard to compute.
In these situations, where the normalizing constant is hard to compute, we can only evaluate the posterior up to that constant: the unnormalized density $p(z|x)\,p(x)$ does not integrate to one, so expectations under the posterior cannot be computed directly.
This is exactly where MCMC is useful: it lets you draw samples from the posterior, and hence approximate integrals (expectations) under it, without ever computing the normalizing constant.
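To make that concrete, here is a minimal Metropolis-Hastings sketch (my own toy example, not from the question; the target density and function names are made up for illustration). Note that the acceptance ratio only involves the unnormalized posterior, so the normalizing constant cancels and never has to be computed:

```python
import numpy as np

def unnormalized_posterior(x):
    # Stand-in for p(z|x) * p(x): a bimodal density, deliberately left unnormalized.
    return np.exp(-0.5 * (x - 2.0) ** 2) + 0.5 * np.exp(-0.5 * (x + 2.0) ** 2)

def metropolis_hastings(n_samples, proposal_sd=1.0, x0=0.0, seed=0):
    rng = np.random.default_rng(seed)
    samples = np.empty(n_samples)
    x = x0
    for i in range(n_samples):
        x_new = x + rng.normal(0.0, proposal_sd)  # symmetric random-walk proposal
        # The ratio of unnormalized densities equals the ratio of true posteriors,
        # because the unknown normalizing constant cancels.
        accept_prob = unnormalized_posterior(x_new) / unnormalized_posterior(x)
        if rng.uniform() < accept_prob:
            x = x_new
        samples[i] = x
    return samples

samples = metropolis_hastings(50_000)
# Posterior expectations become plain sample averages, e.g. an estimate of E[x | z]:
print(samples[5_000:].mean())  # discard burn-in, average the rest
```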
Upvotes: 1
Reputation: 131
Monte Carlo makes sense because it rests on the Law of Large Numbers, which states that the sample mean and variance converge to the population mean and variance as the number of samples grows sufficiently large.
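As a toy illustration of that convergence (my own sketch, not part of the original answer), averaging draws from an Exponential(1) distribution, whose population mean and variance are both 1:

```python
import numpy as np

rng = np.random.default_rng(0)
for n in (10, 1_000, 100_000):
    x = rng.exponential(scale=1.0, size=n)  # population mean = 1.0, population variance = 1.0
    print(n, x.mean(), x.var())             # both estimates approach 1.0 as n grows
```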
The next question is: how large should the sample size be?
One standard answer is given by the formula
$N \ge 0.25\,(z_{\alpha/2}/\epsilon)^2$
where
N - the required sample size,
$z_{\alpha/2}$ - the standard normal critical value for confidence level $1-\alpha$,
ϵ - the allowed margin of error,
and 0.25 is a worst-case bound on the variance.
Instead of drawing an arbitrarily large number of samples, my suggestion would be to work out the required sample size from this formula by plugging in the desired confidence level and error tolerance.
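For example, a small sketch of that calculation (my own illustration, reading the formula with confidence level $1-\alpha$ and error tolerance ϵ):

```python
from math import ceil
from scipy.stats import norm

alpha = 0.05                     # 95% confidence
eps = 0.01                       # allowed margin of error
z = norm.ppf(1 - alpha / 2)      # z_{alpha/2}, the standard normal critical value
N = ceil(0.25 * (z / eps) ** 2)  # conservative sample size from the formula above
print(N)                         # about 9604 samples for these settings
```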
Upvotes: 3