Daniel Svozil
Daniel Svozil

Reputation: 85

warning messages in lme4 for survival analysis that did not arise 3 years ago

I am trying to fit a generalized linear mixed-effects model to my data, using the lme4 package.

The data can be described as follows (see example below): Survival data of fish over 28 days. Explanatory variables in the example data set are:

Response variable

My problem is unusual, in that I have fitted the following model before:

   dat.asr3<-glmer(csns~treatment+Day+Region+
      treatment*Region+Day*Region+Day*treatment*Region+
      (1|tub)+(1|aquarium),weights=start, 
      family=binomial, data=data2)  

However, now that I am attempting to re-run the model, to generate analyses for publication, I am getting the following errors with the same model structure and package. Output is listed below:

> Warning messages:  

1: In eval(expr, envir, enclos) : non-integer #successes in a binomial glm!
2: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
Model failed to converge with max|grad| = 1.59882 (tol = 0.001, component >1)
3: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
Model is nearly unidentifiable: very large eigenvalue
- Rescale variables?;Model is nearly unidentifiable: large eigenvalue ratio
- Rescale variables?

My understanding is the following:

Warning message 1.

non-integer #success in a binomial glm refers to the proportion format of the csns variable. I have consulted several sources, here included, github, r-help, etc, and all suggested this. The research fellow that assisted me in this analysis 3 years ago, is unreachable. Can it have to do with changes in lme4 package over the last 3 years?

Warning message 2.

I understand this is a problem because there are insufficient data points to fit the model to, particularly at
L-30-1, L-30-2 and L-30-3,
where only two observations are made:
Day 0 csns=1.00 and Day 1 csns=0.00
for all three aquaria. Therefore there is no variability or sufficient data to fit the model to.

Nevertheless, this model in lme4 has worked before, but doesn't run without these warnings now.

Warning message 3

This one is entirely unfamiliar to me. Never seen it before.

Sample data:

Region treatment replicate tub Day stage csns start aquarium  
 N        14         1  13   0    1 1.00   107   N-14-1
 N        14         1  13   1    1 1.00   107   N-14-1
 N        14         1  13   2    1 0.99   107   N-14-1
 N        14         1  13   3    1 0.99   107   N-14-1
 N        14         1  13   4    1 0.99   107   N-14-1
 N        14         1  13   5    1 0.99   107   N-14-1

The data in question 1005cs.csv is available here via we transfer: http://we.tl/ObRKH0owZb

Any help with deciphering this problem, would be greatly appreciated. Also any alternative suggestions for suitable packages or methods to analyse this data would be great too.

Upvotes: 0

Views: 1058

Answers (1)

Ben Bolker
Ben Bolker

Reputation: 226761

tl;dr the "non-integer successes" warning is accurate; it's up to you to decide whether fitting a binomial model to these data really makes sense. The other warnings suggest that the fit is a bit unstable, but scaling and centering some of the input variables can make the warnings go away. It's up to you, again, to decide whether the results from these different formulations are different enough for you to worry about ...

data2 <- read.csv("1005cs.csv")
library(lme4)

Fit model (slightly more compact model formulation)

dat.asr3<-glmer(
   csns~Day*Region*treatment+
        (1|tub)+(1|aquarium),
            weights=start, family=binomial, data=data2)

I do get the warnings you report.

Let's take a look at the data:

library(ggplot2); theme_set(theme_bw())
ggplot(data2,aes(Day,csns,colour=factor(treatment)))+
        geom_point(aes(size=start),alpha=0.5)+facet_wrap(~Region)

enter image description here

Nothing obviously problematic here, although it does clearly show that the data are very close to 1 for some treatment combinations, and that the treatment values are far from zero. Let's try scaling & centering some of the input variables:

data2sc <- transform(data2,
                     Day=scale(Day),
                     treatment=scale(treatment))
dat.asr3sc <- update(dat.asr3,data=data2sc)

Now the "very large eigenvalue" warning is gone, but we still have the "non-integer # successes" warning, and a max|grad|=0.082. Let's try another optimizer:

dat.asr3scbobyqa <- update(dat.asr3sc,
                     control=glmerControl(optimizer="bobyqa"))

Now only the "non-integer #successes" warning remains.

d1 <- deviance(dat.asr3)
d2 <- deviance(dat.asr3sc)
d3 <- deviance(dat.asr3scbobyqa)
c(d1,d2,d3)
## [1] 12597.12 12597.31 12597.56

These deviances don't differ by very much (0.44 on the deviance scale is more than could be accounted for by round-off error, but not much difference in goodness of fit); actually, the first model gives the best (lowest) deviance, suggesting that the warnings are false positives ...

resp <- with(data2,csns*start)
plot(table(resp-floor(resp)))

enter image description here

This makes it clear that there really are non-integer responses, so the warning is correct.

Upvotes: 3

Related Questions