Blundering Ecologist
Blundering Ecologist

Reputation: 1315

Extracting posterior modes and credible intervals from glmmTMB output

I normally work with lme4 package, but the glmmTMB package is increasingly becoming better suited to work with highly complicated data (think overdispersion and/or zero-inflation).

Is there a way to extract posterior modes and credible intervals from glmmTMB models, similar to how it is done for lme4 models (example here).

Details:

I am working with count data (available here) that are zero-inflated and overdispersed and have random effects. The package best suited to work with this sort of data is the glmmTMB (details here). (Note two outliers: euc0==78 and np_other_grass==20).

The data looks like this:

euc0 ea_grass ep_grass np_grass np_other_grass month year precip season   prop_id quad
 3      5.7      0.0     16.7            4.0     7 2006    526 Winter    Barlow    1
 0      6.7      0.0     28.3            0.0     7 2006    525 Winter    Barlow    2
 0      2.3      0.0      3.3            0.0     7 2006    524 Winter    Barlow    3
 0      1.7      0.0     13.3            0.0     7 2006    845 Winter    Blaber    4
 0      5.7      0.0     45.0            0.0     7 2006    817 Winter    Blaber    5
 0     11.7      1.7     46.7            0.0     7 2006    607 Winter    DClark    3

The glmmTMB model:

model<-glmmTMB(euc0 ~ ea_grass + ep_grass + np_grass + np_other_grass + (1|prop_id), data = euc, family= nbinom2) #nbimom2 lets var increases quadratically
summary(model)
confint(model) #this gives the confidence intervals

How I would normally extract the posterior mode and credible intervals for a lmer/glmer model:

#extracting model estimates and credible intervals
sm.model <-arm::sim(model, n.sim=1000)
smfixef.model = sm.model@fixef
smfixef.model =coda::as.mcmc(smfixef.model)
MCMCglmm::posterior.mode(smfixef.model)  #mode of the distribution
coda::HPDinterval(smfixef.model)  #credible intervals

#among-brood variance
bid<-sm.model@ranef$prop_id[,,1]
bvar<-as.vector(apply(bid, 1, var)) #between brood variance posterior distribution
bvar<-coda::as.mcmc(bvar)
MCMCglmm::posterior.mode(bvar) #mode of the distribution
coda::HPDinterval(bvar) #credible intervals

Upvotes: 4

Views: 1076

Answers (2)

Daniel
Daniel

Reputation: 7832

I think Ben has already answered your question, so my answer does not add much to the discussion... Maybe just one thing, as you wrote in your comments that you're interested in the within- and between-group variances. You can get these information via parameters::random_parameters() (if I did not misunderstand what you were looking for). See example below that first generates simulated samples from a multivariate normal (just like in Ben's example), and later gives you a summary of the random effect variances...

library(readr)
library(glmmTMB)
library(parameters)
library(bayestestR)
library(insight)

euc_data <- read_csv("D:/Downloads/euc_data.csv")
model <-
  glmmTMB(
    euc0 ~ ea_grass + ep_grass + np_grass + np_other_grass + (1 | prop_id),
    data = euc_data,
    family = nbinom2
  ) #nbimom2 lets var increases quadratically


# generate samples
samples <- parameters::simulate_model(model)
#> Model has no zero-inflation component. Simulating from conditional parameters.


# describe samples
bayestestR::describe_posterior(samples)
#> # Description of Posterior Distributions
#> 
#> Parameter      | Median |           89% CI |    pd |        89% ROPE | % in ROPE
#> --------------------------------------------------------------------------------
#> (Intercept)    | -1.072 | [-2.183, -0.057] | 0.944 | [-0.100, 0.100] |     1.122
#> ea_grass       | -0.001 | [-0.033,  0.029] | 0.525 | [-0.100, 0.100] |   100.000
#> ep_grass       | -0.050 | [-0.130,  0.038] | 0.839 | [-0.100, 0.100] |    85.297
#> np_grass       | -0.020 | [-0.054,  0.012] | 0.836 | [-0.100, 0.100] |   100.000
#> np_other_grass | -0.002 | [-0.362,  0.320] | 0.501 | [-0.100, 0.100] |    38.945


# or directly get summary of sample description
sp <- parameters::simulate_parameters(model, ci = .95, ci_method = "hdi", test = c("pd", "p_map"))
sp
#> Model has no zero-inflation component. Simulating from conditional parameters.
#> # Description of Posterior Distributions
#> 
#> Parameter      | Coefficient | p_MAP |    pd |              CI
#> --------------------------------------------------------------
#> (Intercept)    |      -1.037 | 0.281 | 0.933 | [-2.305, 0.282]
#> ea_grass       |      -0.001 | 0.973 | 0.511 | [-0.042, 0.037]
#> ep_grass       |      -0.054 | 0.553 | 0.842 | [-0.160, 0.047]
#> np_grass       |      -0.019 | 0.621 | 0.802 | [-0.057, 0.023]
#> np_other_grass |       0.019 | 0.999 | 0.540 | [-0.386, 0.450]

plot(sp) + see::theme_modern()
#> Model has no zero-inflation component. Simulating from conditional parameters.

# random effect variances
parameters::random_parameters(model)
#> # Random Effects
#> 
#> Within-Group Variance         2.92 (1.71)
#> Between-Group Variance
#>   Random Intercept (prop_id)   2.1 (1.45)
#> N (groups per factor)
#>   prop_id                       18
#> Observations                   346

insight::get_variance(model)
#> Warning: mu of 0.2 is too close to zero, estimate of random effect variances may be unreliable.
#> $var.fixed
#> [1] 0.3056285
#> 
#> $var.random
#> [1] 2.104233
#> 
#> $var.residual
#> [1] 2.91602
#> 
#> $var.distribution
#> [1] 2.91602
#> 
#> $var.dispersion
#> [1] 0
#> 
#> $var.intercept
#>  prop_id 
#> 2.104233

Created on 2020-05-26 by the reprex package (v0.3.0)

Upvotes: 2

Ben Bolker
Ben Bolker

Reputation: 226741

Most of an answer:

  1. Getting a multivariate Normal sample of the parameters of the conditional model is pretty easy (I think this is what arm::sim() is doing.
library(MASS)
pp <- fixef(model)$cond
vv <- vcov(model)$cond
samp <- MASS::mvrnorm(1000, mu=pp, Sigma=vv)

(then use the rest of your method above).

  1. I'm a little skeptical that your second example is doing what you want it to do. The variance of the conditional modes is not necessarily a good estimate of the between-group variance (e.g. see here). Furthermore, I'm nervous about the half-assed-Bayesian approach (e.g., why no priors? Why look at the posterior mode, which is rarely a meaningful value in a Bayesian context?) although I do sometimes use similar approaches myself!) However, it's not too hard to use glmmTMB results to do a proper Markov chain Monte Carlo analysis:
library(tmbstan)
library(rstan)
library(coda)
library(emdbook) ## for lump.mcmc.list(), or use runjags::combine.mcmc()

t2 <- system.time(m2 <- tmbstan(model$obj))
m3 <- rstan::As.mcmc.list(m2)
lattice::xyplot(m3,layout=c(5,6))
m4 <- emdbook::lump.mcmc.list(m3)
coda::HPDinterval(m4)

It may be helpful to know that the theta column of m4 is the log of the among-group standard standard deviation ...

(See vignette("mcmc", package="glmmTMB") for a little bit more information ...)

Upvotes: 3

Related Questions