Reputation: 3
I want to do power analysis based on preliminary experiments to inform experimental design choices of future experiments.
I am using glmmTMB
to fit a negative binomial model to my data, using fixed predictors + random effects. I previously used simr
to perform poser analysis of models fitted using glmer
, but according to this issue in simr
GitHub, simr
does not support glmmTMB
models.
I am thus trying to find a workaround solution based on glmmTMB
built-in functions. I am planning on doing the following:
simulate_new()
To this end, I am trying to get what the different values in the newparams
argument of simulate_new()
represent. In the simulate_new()
help page is the following example:
sim_count <- simulate_new(~ mined + (1|site),
newdata = Salamanders,
zi = ~ mined,
family = nbinom2,
newparams = list(beta = c(2, 1),
betazi = c(-0.5, 0.5), ## logit-linear model for zi
betadisp = log(2), ## log(NB dispersion)
theta = log(1)) ## log(among-site SD)
)
However, when fitting the corresponding model (summary(glmmTMB(count ~ mined + (1|site), data=Salamanders, ziformula=~mined, family=nbinom2))
), I get quite different coefficients:
Family: nbinom2 ( log )
Formula: count ~ mined + (1 | site)
Zero inflation: ~mined
Data: Salamanders
AIC BIC logLik deviance df.resid
1741.4 1768.2 -864.7 1729.4 638
Random effects:
Conditional model:
Groups Name Variance Std.Dev.
site (Intercept) 0.1338 0.3658
Number of obs: 644, groups: site, 23
Dispersion parameter for nbinom2 family (): 0.965
Conditional model:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.5573 0.3711 -1.501 0.133
minedno 1.4674 0.3616 4.059 4.94e-05 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Zero-inflation model:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 0.2404 0.4863 0.494 0.62103
minedno -2.2561 0.7829 -2.882 0.00396 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
It thus seems that the help page from simulate_new()
uses arbitrary values as an example, not values corresponding to coefficients that represent the data?
If I wanted to correctly simulate new data, would the following be correct?
sim_count <- simulate_new(~ mined + (1|site),
newdata = Salamanders,
zi = ~ mined,
family = nbinom2,
newparams = list(beta = c(-0.56, 1.47),
betazi = c(0.24, -2.26), ## logit-linear model for zi
betadisp = log(0.97), ## log(NB dispersion)
theta = log(0.37)) ## log(among-site SD)
Thank you in advance for your help and insights, Cheers, Theo
Upvotes: 0
Views: 19
Reputation: 226742
(Copied from answer on the glmmTMB issues list)
You're correct. Of course, if you want to simulate from an already fitted model, you just need simulate()
— the use case of simulate_new()
is when you want to simulate using different parameters than those from a fitted model, or where you are in a pure-simulation scenario where you don't have any real data to fit a model to in the first place ...
If we fit a model to the simulated response, we get answers that are reasonably close to the values we put in (to be more thorough, we could do this many times and show that the distribution of fitted values is centered on the true values [i.e., there is not much bias]).
set.seed(101)
sim_count <- simulate_new(~ mined + (1|site),
newdata = Salamanders,
zi = ~ mined,
family = nbinom2,
newparams = list(beta = c(2, 1),
betazi = c(-0.5, 0.5), ## logit-linear model for zi
betadisp = log(2), ## log(NB dispersion)
theta = log(1)) ## log(among-site SD)
)
Salamanders$sim_count <- sim_count[[1]]
model <- glmmTMB(sim_count ~ mined + (1|site),
data=Salamanders, ziformula=~mined, family=nbinom2)
coef(summary(model))
$cond
Estimate Std. Error z value Pr(>|z|)
(Intercept) 1.648139 0.2572210 6.407481 1.479438e-10
minedno 1.386065 0.3549957 3.904455 9.443779e-05
$zi
Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.7163624 0.1509574 -4.745461 2.080317e-06
minedno 0.6331215 0.1863901 3.396755 6.818988e-04
$disp
NULL
Upvotes: 1