tabtimm


Error messages when running glmer in R

I am attempting to run two similar generalized linear mixed models in R. Both models have the same predictors, covariates and random factors; however, the response variables differ. Both models require the lme4 package. The issue I was having with the second model has been resolved by Ben Bolker.

In the first model, the response variable is biomass weight and family = gaussian.

 global.model <- glmer(ex.drywght ~ forestloss562*forestloss17*roaddenssec*nearestroadprim + 
                            elevation + soilPC1 + soilPC2 +
                            (1|block/fragment), 
                          data = RespPredComb, 
                          family = "gaussian") 

Predictors have the following units: 
    forestloss562 = %,
    forestloss17 = %, 
    roaddenssec = (km/km2) and
    nearestroadprim = (m).

Executing this model brings up the following warning messages:

Warning messages:

1: In glmer(ex.drywght ~ forestloss562 * forestloss17 * roaddenssec * : calling glmer() with family=gaussian (identity link) as a shortcut to lmer() is deprecated; please call lmer() directly

2: Some predictor variables are on very different scales: consider rescaling

I then perform the following steps (following the sequence described in Grueber et al. (2011)):

I standardize predictors,

stdz.model <- standardize(global.model, standardize.y = FALSE)

(requires package arm)
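For reference, standardize() follows Gelman (2008): each numeric input is centred and divided by 2 standard deviations, which puts its coefficient on a scale comparable to that of a binary predictor. A minimal sketch of that rescaling for a single numeric vector (rescale2sd is an illustrative helper, not part of arm):

rescale2sd <- function(x) (x - mean(x, na.rm = TRUE)) / (2 * sd(x, na.rm = TRUE))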

use automated model selection with subsets of the supplied ‘global’ model

model.set <- dredge(stdz.model) 

(requires package MuMIn)

Here I get the following warning message:

Warning message:
In dredge(stdz.model) : comparing models fitted by REML
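This warning appears because (RE)ML likelihoods are not comparable between models with different fixed effects. A common remedy (a sketch, assuming the Gaussian fit above, where update() refits a merMod) is to refit the standardized model by maximum likelihood before dredging:

stdz.model.ML <- update(stdz.model, REML = FALSE)
model.set <- dredge(stdz.model.ML)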

find the set of top models (those within 2 AICc units of the best) and

top.models <- get.models(model.set, subset = delta < 2)

do model averaging

model.avg(model.set, subset = delta < 2)

Here, I get this error message:

Error in apply(apply(z, 2L, is.na), 2, all) : 
  dim(X) must have a positive length
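One possible cause (an assumption on my part, not confirmed from this output): model averaging needs at least two component models, so if only a single model satisfies delta < 2 the call can fail this way. A quick check:

length(get.models(model.set, subset = delta < 2))  ## needs to be >= 2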

Any advice on how to possibly fix this error would be very much appreciated.

In the second model, the response variable is richness and family = poisson.

global.model <- glmer(ex.richness ~ forestloss562*forestloss17*roaddenssec*nearestroadprim + 
                        elevation + soilPC1 + soilPC2 +
                       (1|block/fragment), 
                     data = mydata, 
                     family = "poisson") 

When I execute the above command I get the following error and warning messages:

Error: (maxstephalfit) PIRLS step-halvings failed to reduce deviance in pwrssUpdate
In addition: Warning messages:

1: Some predictor variables are on very different scales: consider rescaling

2: In pwrssUpdate(pp, resp, tolPwrss, GQmat, compDev, fac, verbose) : Cholmod warning 'not positive definite' at file:../Cholesky/t_cholmod_rowfac.c, line 431

3: In pwrssUpdate(pp, resp, tolPwrss, GQmat, compDev, fac, verbose) : Cholmod warning 'not positive definite' at file:../Cholesky/t_cholmod_rowfac.c, line 431

Please find a reproducible subset of my data below:

    structure(list(plot.code = structure(c(1L, 3L, 2L, 4L, 5L, 6L, 
7L), .Label = c("a100m56r", "b1m177r", "c100m56r", "d1f1r", "e1m177r", 
"f1m17r", "lf10m56r"), class = "factor"), site.code = structure(c(1L, 
3L, 2L, 4L, 5L, 6L, 7L), .Label = c("a100m56", "b1m177", "c100m56", 
"d1f1", "e1m177", "f1m17", "lf10m56"), class = "factor"), block = structure(c(1L, 
3L, 2L, 4L, 5L, 6L, 7L), .Label = c("a", "b", "c", "d", "e", 
"f", "lf"), class = "factor"), fragment = structure(c(1L, 3L, 
2L, 4L, 5L, 6L, 7L), .Label = c("a100", "b1", "c100", "d1", "e1", 
"f1", "lf10"), class = "factor"), elevation = c(309L, 342L, 435L, 
495L, 443L, 465L, 421L), forestloss562 = c(25.9, 56.77, 5.32, 
27.4, 24.25, 3.09, 8.06), forestloss17 = c(7.47, 51.93, 79.76, 
70.41, 80.55, 0, 0), roaddenssec = c(2.99, 3.92, 2.61, 1.58, 
1.49, 1.12, 1.16), nearestroadprim = c(438L, 237L, 2637L, 327L, 
655L, 528L, 2473L), soilPC1 = c(0.31, -0.08, 1.67, 2.39, -1.33, 
-1.84, -0.25), soilPC2 = c(0.4, 0.41, -0.16, 0.15, 0.03, -0.73, 
0.51), ex.richness = c(0L, 0L, 1L, 7L, 0L, 0L, 1L), ex.drywght = c(0, 
0, 1.255, 200.2825, 0, 0, 0.04)), .Names = c("plot.code", "site.code", 
"block", "fragment", "elevation", "forestloss562", "forestloss17", 
"roaddenssec", "nearestroadprim", "soilPC1", "soilPC2", "ex.richness", 
"ex.drywght"), class = "data.frame", row.names = c(NA, -7L))

Upvotes: 0


Answers (1)

Ben Bolker


tl;dr you need to standardize your variables before you fit the model, for greater numerical stability. I also have a few comments about the advisability of what you're doing, but I'll save them for the end ...

source("SO_glmer_26904580_data.R")
library("arm")
library("lme4")
library("MuMIn")

Try the first fit:

pmod <- glmer(ex.richness ~
              forestloss562*forestloss17*roaddenssec*nearestroadprim + 
                  elevation + soilPC1 + soilPC2 +
                      (1|block/fragment), 
              data = dat, 
              family = "poisson")

This fails, as reported above. It also repeats the warning about predictor scales:

## 1: Some predictor variables are on very different scales: consider rescaling 

which provides a clue.

Scaling the numeric predictors:

pvars <- c("forestloss562","forestloss17",
           "roaddenssec","nearestroadprim",
           "elevation","soilPC1","soilPC2")
datsc <- dat
datsc[pvars] <- lapply(datsc[pvars],scale)
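Note that scale() centres each column and divides it by its standard deviation, and it stores the constants it used as attributes, which is handy for back-transforming coefficients later:

attributes(datsc$elevation)  ## includes "scaled:center" and "scaled:scale"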

Try again:

pmod <- glmer(ex.richness ~
              forestloss562*forestloss17*roaddenssec*nearestroadprim + 
                  elevation + soilPC1 + soilPC2 +
                      (1|block/fragment), 
              data = datsc, 
              family = "poisson",
              na.action="na.fail")

This works, although we get a warning message about a too-large gradient -- I think this is actually ignorable (we're still working on getting these error sensitivity thresholds right).
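If you want to inspect that gradient yourself, the scaled gradient can be computed from the stored derivatives (the usual tolerance is about 0.001, but treat it as a rough guide):

relgrad <- with(pmod@optinfo$derivs, solve(Hessian, gradient))
max(abs(relgrad))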

As far as I can tell, the following lines seem to be working:

stdz.model <- standardize(pmod, standardize.y = FALSE)
## increases max gradient -- larger warning
model.set <- dredge(stdz.model)  ## slow, but running ...

Here are my comments about advisability:

  • Not even counting random-effects parameters, you have only 8x as many observations as predictor variables. This is pushing it (a rule of thumb is that you should have 10-20 observations per parameter):

 nrow(datsc)            ## 159
 ncol(getME(pmod,"X"))  ## 19

  • Dredging/multi-model averaging over models with and without interactions can be dangerous: at the very least, centering the continuous variables is necessary for the results to be interpretable. (I don't know whether dredge does anything to try to be sensible in this case.)

I also tried glmmLasso on this problem -- it ended up shrinking away all of the fixed effect terms ...

library("glmmLasso")
datsc$bf <- interaction(datsc$block,datsc$fragment)
glmmLasso(ex.richness ~
          forestloss562+forestloss17+roaddenssec+nearestroadprim + 
              elevation + soilPC1 + soilPC2,
          rnd=list(block=~1,bf=~1),
              data = datsc, 
              family = poisson(),
          lambda=500)
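lambda sets the strength of the L1 penalty, and lambda = 500 is evidently strong enough here to zero out everything. One could refit over a grid of penalties and compare the retained terms (a sketch; I'm assuming the fitted object exposes the fixed effects as $coefficients):

for (lam in c(5, 50, 500)) {
    fit <- glmmLasso(ex.richness ~
                         forestloss562 + forestloss17 + roaddenssec + nearestroadprim +
                         elevation + soilPC1 + soilPC2,
                     rnd = list(block = ~1, bf = ~1),
                     data = datsc, family = poisson(), lambda = lam)
    print(fit$coefficients)  ## fixed-effect estimates at this penalty
}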

Upvotes: 9
