Why do I have to define the top-level parameter in JAGS, and how?

Question

According to the JAGS user manual (section "Compilation"):

Any node that is used on the right hand side of a relation, but is not defined on the left hand side of any relation, is assumed to be a constant node. Its value must be supplied in the data file.

This seems odd to me: many probabilistic graphical models contain top-level parameters that are meant to be inferred, and isn't that what a Bayesian network is for? So why do I need to fix the value of a top-level parameter in advance? And what should I do when I want to implement a model like LDA, in which the topic-distribution prior alpha and the word-distribution beta are unknown? Please tell me if I have said anything wrong.

Matt Denwood · Accepted Answer

If you want to make inference about a parameter, then by definition this is NOT a top-level parameter. If you want to infer something about a parameter then you have to put a prior on it, in which case the hyper-parameters in the prior are the top-level parameters. For example:

Count ~ dpois(lambda)
lambda <- 10

This means that lambda is a top-level parameter: its value is fixed at 10, so it cannot be inferred.

Count ~ dpois(lambda)
lambda ~ dgamma(0.001, 0.001)

This means that lambda is inferred, and the hyper-parameters of the gamma prior are the top-level parameters. To see this more explicitly, notice that the following syntax is equivalent:

Count ~ dpois(lambda)
lambda ~ dgamma(shape, rate)
shape <- 0.001
rate <- 0.001

The shape and rate parameters could also be specified in the data if you prefer, but that would be a bit unusual.
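The distinction between the inferred lambda and the fixed shape and rate can also be checked by hand, because a gamma prior is conjugate to a Poisson likelihood. The following is a minimal Python sketch, not JAGS code: the function name gamma_poisson_posterior and the counts are hypothetical and chosen only for illustration. It shows that the data update lambda, while shape and rate enter only as constants.

```python
def gamma_poisson_posterior(counts, shape, rate):
    """Conjugate update: a gamma(shape, rate) prior on lambda combined with
    Poisson counts gives a gamma(shape + sum(counts), rate + n) posterior."""
    post_shape = shape + sum(counts)
    post_rate = rate + len(counts)
    return post_shape, post_rate


# Hypothetical observed counts (assumption, not from the original post).
counts = [7, 12, 9, 11]

# Top-level constants: these are supplied, never inferred.
shape, rate = 0.001, 0.001

post_shape, post_rate = gamma_poisson_posterior(counts, shape, rate)
posterior_mean = post_shape / post_rate

print(f"posterior for lambda: gamma({post_shape}, {post_rate})")
print(f"posterior mean of lambda: {posterior_mean:.3f}")
```

Here lambda ends up with a posterior distribution, whereas shape and rate are exactly the values that were passed in.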

Choice of a reasonable prior distribution for these parameters is not always straightforward, but is an integral part of any Bayesian analysis. Don't just assume that a prior with large variance is minimally informative without thinking about it and/or testing it.
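One way to test a prior, as suggested above, is to compare the posterior under several candidate priors. The sketch below again uses the gamma-Poisson conjugate result in Python rather than JAGS. The helper poisson_gamma_mean, the single zero count, and the list of candidate priors are all assumptions made for illustration.

```python
def poisson_gamma_mean(counts, shape, rate):
    """Posterior mean of lambda under a gamma(shape, rate) prior
    and a Poisson likelihood (conjugate result)."""
    return (shape + sum(counts)) / (rate + len(counts))


# Hypothetical sparse data set: a single observed count of zero.
counts = [0]

# Candidate priors as (shape, rate); all illustrative choices.
priors = {
    "gamma(0.001, 0.001)": (0.001, 0.001),
    "gamma(0.5, 0.001)": (0.5, 0.001),
    "gamma(1, 1)": (1.0, 1.0),
}

results = {
    name: poisson_gamma_mean(counts, s, r) for name, (s, r) in priors.items()
}

for name, mean in results.items():
    print(f"{name}: posterior mean of lambda = {mean:.4f}")
```

With this little data, the gamma(0.001, 0.001) prior, despite its large variance, pulls the posterior mean almost to zero, while the other two priors give values near 0.5. With more data the three results would move closer together.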

Matt
