Reputation: 1823
Random coefficient Poisson models are rather difficult to fit, there tends to be some variability in parameter estimates between lme4 and glmmADMB. But in my case:
# Packages
library(lme4)
library(glmmADMB)
#Open my dataset
myds<-read.csv("https://raw.githubusercontent.com/Leprechault/trash/main/my_glmm_dataset.csv")
str(myds)
# 'data.frame': 526 obs. of 10 variables:
# $ Bioma : chr "Pampa" "Pampa" "Pampa" "Pampa" ...
# $ estacao : chr "verao" "verao" "verao" "verao" ...
# $ ciclo : chr "1°" "1°" "1°" "1°" ...
# $ Hour : int 22 23 0 1 2 3 4 5 6 7 ...
# $ anthill : num 23.5 23.5 23.5 23.5 23.5 ...
# $ formigueiro: int 2 2 2 2 2 2 2 2 2 2 ...
# $ ladenant : int 34 39 29 25 20 31 16 28 21 12 ...
# $ unladen : int 271 258 298 317 316 253 185 182 116 165 ...
# $ UR : num 65.7 69 71.3 75.8 78.1 ...
# $ temp : num 24.3 24.3 24 23.7 23.1 ...
I have a number of insects (ladenant
) in the function of biome(Bioma
), temperature (temp
) and humidity(UR
), but formiguieros
is pseudoreplication. Then I try to model the relationship using lme4 and glmmADMB.
First I try lme4:
m.laden.1 <- glmer(ladenant ~ Bioma + poly(temp,2) + UR + (1 | formigueiro), data = DataBase, family = poisson(link = "log"))
summary(m.laden.1)
# Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) ['glmerMod']
# Family: poisson ( log )
# Formula: ladenant ~ Bioma + poly(temp, 2) + UR + (1 | formigueiro)
# Data: DataBase
# AIC BIC logLik deviance df.resid
# 21585.9 21615.8 -10786.0 21571.9 519
# Scaled residuals:
# Min 1Q Median 3Q Max
# -10.607 -4.245 -1.976 2.906 38.242
# Random effects:
# Groups Name Variance Std.Dev.
# formigueiro (Intercept) 0.02049 0.1432
# Number of obs: 526, groups: formigueiro, 5
# Fixed effects:
# Estimate Std. Error z value Pr(>|z|)
# (Intercept) 0.7379495 0.0976701 7.556 4.17e-14 ***
# BiomaTransition 1.3978383 0.0209623 66.684 < 2e-16 ***
# BiomaPampa -0.1256759 0.0268164 -4.687 2.78e-06 ***
# poly(temp, 2)1 7.1035195 0.2079550 34.159 < 2e-16 ***
# poly(temp, 2)2 -7.2900687 0.2629908 -27.720 < 2e-16 ***
# UR 0.0302810 0.0008029 37.717 < 2e-16 ***
# ---
# Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
# Correlation of Fixed Effects:
# (Intr) BmTrns BimPmp p(,2)1 p(,2)2
# BiomaTrnstn -0.586
# BiomaPampa -0.199 0.352
# ply(tmp,2)1 -0.208 0.267 0.312
# ply(tmp,2)2 -0.191 0.085 -0.175 -0.039
# UR -0.746 0.709 0.188 0.230 0.316
# optimizer (Nelder_Mead) convergence code: 0 (OK)
# Model is nearly unidentifiable: very large eigenvalue
# - Rescale variables?
Second I try glmmADMB:
m.laden.2 <- glmmadmb(ladenant ~ Bioma + poly(temp,2) + UR + (1 | formigueiro), data = DataBase, family = "poisson", link = "log")
summary(m.laden.2)
# Call:
# glmmadmb(formula = ladenant ~ Bioma + poly(temp, 2) + UR + (1 |
# formigueiro), data = DataBase, family = "poisson", link = "log")
# AIC: 12033.9
# Coefficients:
# Estimate Std. Error z value Pr(>|z|)
# (Intercept) 1.52390 0.26923 5.66 1.5e-08 ***
# BiomaTransition 0.23967 0.08878 2.70 0.0069 **
# BiomaPampa 0.09680 0.05198 1.86 0.0626 .
# poly(temp, 2)1 -0.38754 0.55678 -0.70 0.4864
# poly(temp, 2)2 -1.16028 0.39608 -2.93 0.0034 **
# UR 0.01560 0.00261 5.97 2.4e-09 ***
# ---
# Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
# Number of observations: total=526, formigueiro=5
# Random effect variance(s):
# Group=formigueiro
# Variance StdDev
# (Intercept) 0.07497 0.2738
Despite the between different statistical packages and estimations, the models are completely different significance levels in the case of Biome
variable. My question is,
where is any other approach that can be used to compare the results and choose a final model?
Thanks in advance.
Upvotes: 1
Views: 1791
Reputation: 226192
I got a little bit carried away. tl;dr as pointed out in comments, it's hard to get glmmADMB
to work with a Poisson model, but a model with overdispersion (e.g. negative binomial) is clearly a lot better. Furthermore, you should probably incorporate some aspect of random slopes in the model ...
Packages (colorblindr
is optional):
library(lme4)
library(glmmADMB)
library(glmmTMB)
library(broom.mixed)
library(tidyverse) ## ggplot2, dplyr, tidyr, purrr ...
library(colorblindr) ## remotes::install_github("clauswilke/colorblindr")
theme_set(theme_bw())
Get data: standardize input variables so we can easily make a sensible coefficient plot
## read.csv doesn't work for me out of the box, locale/encoding issues
myds <- readr::read_csv("my_glmm_dataset.csv") %>%
mutate(across(formigueiro, as.factor),
across(c(UR, temp), ~ drop(scale(.))))
Formulas and models:
form <- ladenant ~ Bioma + poly(temp,2) + UR + (1 | formigueiro)
## random slopes, all independent (also tried with correlations (| instead
## of ||), but fails)
form_x <- ladenant ~ Bioma + poly(temp,2) + UR + (1 + UR + poly(temp,2) || formigueiro)
glmer_pois <- glmer(form, data = myds, family = poisson(link = "log"))
## fails
glmmADMB_pois <- try(glmmadmb(form, data = myds, family = "poisson"))
## fails ("Parameters were estimated, but standard errors were not:
## the most likely problem is that the curvature at MLE was zero or negative"
glmmTMB_pois <- glmmTMB(form, data = myds, family = poisson)
glmer_nb2 <- glmer.nb(form, data = myds)
glmmADMB_nb2 <- glmmadmb(form, data = myds, family = "nbinom2")
glmmTMB_nb2 <- update(glmmTMB_pois, family = "nbinom2")
glmmTMB_nb1 <- update(glmmTMB_pois, family = "nbinom1")
glmmTMB_nb2ext <- update(glmmTMB_nb2, formula = form_x)
Put it all together:
modList <- tibble::lst(glmer_pois, glmmTMB_pois, glmer_nb2, glmmADMB_nb2, glmmTMB_nb2,
glmmTMB_nb1, glmmTMB_nb2ext)
bbmle::AICtab(modList)
dAIC df
glmmTMB_nb2ext 0.0 11
glmer_nb2 27.0 8
glmmADMB_nb2 27.1 8
glmmTMB_nb2 27.1 8
glmmTMB_nb1 79.5 8
glmmTMB_pois 16658.0 7
glmer_pois 16658.0 7
The 'nb2' models are all OK, but the random-slopes model is considerably better.
Coefficient plots, including the fixed effects from all methods:
tt <- (purrr::map_dfr(modList, tidy, effects = "fixed", conf.int = TRUE,
.id = "model") |>
dplyr::filter(term != "(Intercept)") |>
tidyr::separate(model, into = c("platform", "distrib"))
)
ggplot(tt, aes(y = term, x = estimate, xmin = conf.low, xmax = conf.high,
colour = platform, shape = distrib)) +
geom_pointrange(position = position_dodge(width = 0.25)) +
geom_vline(xintercept = 0, lty =2) +
scale_colour_OkabeIto()
Upvotes: 2