KateRandall
KateRandall

Reputation: 31

Understanding and fixing false convergence errors in glmmTMB + lme4

I am trying to use glmmTMB to run a number of iterations of my model, but keep getting the same continual error. I have tried to explain my experiment below and inserted the full model I am trying to run.

Background to the experiment

The dependent variable I am trying to model is bacterial 16S gene copy number, used as a proxy for bacterial biomass in this case.

The experimental design is that I have stream sediment from 8 streams that fall along a pollution gradient (impacted to pristine). (Factor 1 = Stream, with 8 levels).

For each of the 8 streams, the following was performed, Sediment was added to 6 trays. 3 of these trays were placed into an artificial stream channel warmed to 13°C, whilst the other 3 were heated to 17°C (Factor 2 = Warming treatment, with 2 levels). There are 16 channels in total and Warming treatment was randomly assigned to a channel.

I then repeatedly measured the 3 trays in each stream channel over four time points (Factor 3 = Day, with 4 levels).

At this moment in time I am treating tray as a true biological replicate and not a pseudorep as the trays are considerably far away from one another in the channels, but this is to be explored.

So just to summarise: the model terms are (all are specified as factors):

The full model I was proposing was,

X4_tmb.nb2<-glmmTMB(CopyNo~Treatment*Stream*Time, family=nbinom2, data=qPCR)   

Even though this version of the model does not include a random effect, I wanted to use the glmmTMB package and not run this using lme4, because I wanted to explore the idea of adding a model component to account for dispersion, and also explore the option of adding tray as a random effect (not sure if this is correct). By running all versions of the model in glmmTMB, I am able to confidently compare their AIC scores. I wouldn't be able to do this if I ran the full model without the dispersion component in lme4 and the others using glmmTMB.

Unfortunately, for most iterations of the full model when using glmmTMB (by this I mean dropping model terms sequentially), I get the same constant warning:

Warning message: In fitTMB(TMBStruc) : Model convergence problem; false convergence (8). See vignette('troubleshooting')

I have tried to understand the error but I am struggling to understand because, the confusing thing is when I run the full model using lme4, it runs with no error.

This is the version of the full model that runs in lme4,

X4 = glm.nb(CopyNo~Treatment*Stream*Time, data = qPCR

As far as I understand from reading https://www.biorxiv.org/content/10.1101/132753v1.full.pdf @ line 225, it is possible to use this package to cross compare between GLMs and GLMMs. Do you know if I have understood this correctly?

I also used the DHARMa package to help validate the models and the version that failed to converge using glmmTMB, pass the KStest, the dispersion test, the outlier test and combined adjusted quantile test, but ideally I do not want the convergence error.

Any help would be greatly appreciated.

Upvotes: 3

Views: 6120

Answers (1)

Ben Bolker
Ben Bolker

Reputation: 226007

There's a bunch here.

warning message

It is unfortunately hard to do very much about this: it is a notoriously obscure error message. As suggested on twitter, you can try a different optimizer, e.g. include

 control = glmmTMBControl(optimizer = optim, optArgs = list(method="BFGS"))

in your call. Hopefully this will give a very similar answer (in which case you conclude that the convergence warning was probably a false positive, as different optimizers are unlikely to fail in the same way) without the warning. (You could try method="CG" above as a third alternative.) (Note that there's a minor bug with printing and summarizing output when using alternate optimizers that was fixed recently; you might want to install the development version if you are working on this before the fix propagates to CRAN.)

"lme4" model

The glm.nb() function is not from the lme4 package, it's from the MASS package. If you had random effects in the model you would use glmer.nb(), which is in the lme4 package ... as with the optimizer-switching tests above, if you get similar answers with glmmTMB and glm.nb you can conclude that the warning from glmmTMB (actually, it's from the nlminb() optimizer that glmmTMB calls internally) is probably a false positive.

The simplest way to check that likelihoods/AICs from different packages are commensurate is to fit the same model in both packages, if possible, e.g.

library(MASS)
library(glmmTMB)

quine.nb1 <- glm.nb(Days ~ Sex/(Age + Eth*Lrn), data = quine)
quine.nb2 <- glmmTMB(Days ~ Sex/(Age + Eth*Lrn), data = quine,
                     family=nbinom2)
all.equal(AIC(quine.nb1),AIC(quine.nb2))  ## TRUE

other details

One of the possible problems with your model is that by fitting the full three-way interaction of three categorical variables you're trying to estimate (2*4*8=) 64 parameters, to 64*3=192 observations (if I understand your experimental design correctly). This is both more likely to run into numerical problems (as above) and may give imprecise results. Although some recommend it, I do not personally recommend a model selection approach (all-subsets or sequential, AIC-based or p-value-based); I'd suggest making Stream into a random effect, i.e.

CopyNo ~ Treatment + (Treatment|StreamID) + (1|Time/StreamID)

This fits (1) an overall treatment effect; (2) variation across streams, variation in treatment effect across streams; (3) variation across time and across streams within time points. This only uses 2 (fixed: intercept + treatment) + 3 (intercept variance, treatment variance, and their covariance across streams) + 2 (variance in time and among streams within time). This is not quite the "maximal" model; since both treatments are measured at each time in each stream, the last term could be (Treatment|Time/StreamID), but this would add a lot of model complexity. Since 4 groups is not much for a random effect, you might find that you want to make the last term into Time + (1|Time:StreamID), which will fit Time as fixed and (streams within time) as random ...

Upvotes: 6

Related Questions