Ian_Fin
Ian_Fin

Reputation: 161

Hyperpriors for hierarchical models with Stan

I'm looking to fit a model to estimate multiple probabilities for binomial data with Stan. I was using beta priors for each probability, but I've been reading about using hyperpriors to pool information and encourage shrinkage on the estimates.

I've seen this example to define the hyperprior in pymc, but I'm not sure how to do something similar with Stan

@pymc.stochastic(dtype=np.float64)
def beta_priors(value=[1.0, 1.0]):
    a, b = value
    if a <= 0 or b <= 0:
        return -np.inf
    else:
        return np.log(np.power((a + b), -2.5))

a = beta_priors[0]
b = beta_priors[1]

With a and b then being used as parameters for the beta prior.

Can anybody give me any pointers on how something similar would be done with Stan?

Upvotes: 3

Views: 1249

Answers (2)

Bob Carpenter
Bob Carpenter

Reputation: 3753

To properly normalize that, you need a Pareto distribution. For example, if you want a distribution p(a, b) ∝ (a + b)^(-2.5), you can use

a + b ~ pareto(L, 1.5);

where a + b > L. There's no way to normalize the density with support for all values greater than or equal to zero---it needs a finite L as a lower bound. There's a discussion of using just this prior as the count component of a hierarchical prior for a simplex.

If a and b are parameters, they can either both be constrained to be positive, or you can leave a unconstrained and declare

real<lower = L - a> b;

to insure a + b > L. L can be a small constant or something more reasonable given your knowledge of a and b.

You should be careful because this will not identify a + b. We use this construction as a hierarchical prior for simplexes as:

parameters {
  real<lower = 1> kappa;
  real<lower = 0, upper = 1> phi;
  vector<lower = 0, upper = 1>[K] theta;

model {
  kappa ~ pareto(1, 1.5);  // power law prior
  phi ~ beta(a, b);  // choose your prior for theta
  theta ~ beta(kappa * phi, kappa * (1 - phi));  // vectorized

There's an extended example in my Stan case study of repeated binary trials, which is reachable from the case studies page on the Stan web site (the case study directory is currently linked under the documentation link from the users tab).

Upvotes: 3

Ian_Fin
Ian_Fin

Reputation: 161

Following suggestions in the comments I'm not sure that I will follow this approach, but for reference I thought I'd at least post the answer to my question of how this could be accomplished in Stan.

After some asking around on Stan Discourses and further investigation I found that the solution was to set a custom density distribution and use the target += syntax. So the equivalent for Stan of the example for pymc would be:

parameters {
  real<lower=0> a;
  real<lower=0> b;
  real<lower=0,upper=1> p;
  ...
}

model {
  target += log((a + b)^-2.5);

  p ~ beta(a,b)
  ...
}

Upvotes: 0

Related Questions