Reputation: 109
I am running a binomial logistic regression using lme4
, which is constructed as follows:
m2_DL <- glmer(Dest ~ Pb_on * Pb_type + Dest_tree + ZF_Dest + (1|Site), data = DF_DL_3, family = binomial(link = "logit"))
All variables are factors save for ZF_Dest, which is numerical. The full model (m1) includes one extra numerical variable and works fine. This second model however fails to converge with warning message:
fixed-effect model matrix is rank deficient so dropping 6 columns / coefficients
Warning message:
In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
Model failed to converge with max|grad| = 0.00144561 (tol = 0.001, component 1)
The rank deficiency is expected in my case and (I assume) not relevant to the problem here. After looking online I found that I could try to add the following argument to my glmer: control = glmerControl(optimizer = "bobyqa")
, which indeed resolves the issue.
However, I wished to better understand what this function did and if I could still compare my models if only my second model has this extra argument. This brought me to this question on SO. One of the comments on this question states: "you did not use a different optimizer (bobyqa is the default for glmer)". But if this is the case then how come adding the argument resolved the warning, as I did not increase the number of iterations as is sometimes also done?
From the answer on that question I understood that if the difference of the likelihoods of both models is close I indicate a 'false positive' of the convergence error, which again is the case for my models. I also read help(converge)
as recommended and used the allFit function indicated there.
This outputs:
bobyqa : fixed-effect model matrix is rank deficient so dropping 6 columns / coefficients
[OK]
Nelder_Mead : fixed-effect model matrix is rank deficient so dropping 6 columns / coefficients
[OK]
nlminbwrap : fixed-effect model matrix is rank deficient so dropping 6 columns / coefficients
[OK]
nmkbw : fixed-effect model matrix is rank deficient so dropping 6 columns / coefficients
[OK]
optimx.L-BFGS-B : fixed-effect model matrix is rank deficient so dropping 6 columns / coefficients
[OK]
nloptwrap.NLOPT_LN_NELDERMEAD : fixed-effect model matrix is rank deficient so dropping 6 columns / coefficients
[OK]
nloptwrap.NLOPT_LN_BOBYQA : fixed-effect model matrix is rank deficient so dropping 6 columns / coefficients
[OK]
Warning messages:
1: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
Model failed to converge with max|grad| = 0.0013785 (tol = 0.001, component 1)
2: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
Model failed to converge with max|grad| = 0.00904036 (tol = 0.001, component 1)
3: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
Model failed to converge with max|grad| = 0.00904036 (tol = 0.001, component 1)
and the object containing the results of allFit contains:
original model:
Dest ~ Pb_on * Pb_type + Dest_tree + ZF_Dest + (1 | Site)
data: DF_DL_3
optimizers (7): bobyqa, Nelder_Mead, nlminbwrap, nmkbw, optimx.L-BFGS-B, nloptwrap.NLOPT_LN_N...
1 optimizer(s) failed
differences in negative log-likelihoods:
max= 6.65e-05 ; std dev= 3.41e-05
Here you can see that 1 optimizer failed and I get several warnings when using allFit
. Now I understand that not all optimizers are made equal and some won't work for certain data. The problem is that I don't understand how to interpret the outcome of this allFit
function, and I am not sure which optimizer failed and if I should be worried. So in summary, my questions:
How could the BOBYQA optimizer resolve the issue if this is indeed the default optimizer?
How should I interpret the outcome of allFit
.
What is the best way to resolve the warning for the second model. Is simply adding control = glmerControl(optimizer = "bobyqa")
suitable?
If I add this argument to this model, can I then still fairly compare this model to others? (Note: I use the anova
function in order to compare models).
Thanks
EDIT 6-3-2020: As per Roland's request here is the summary of the allFit object:
$which.OK
bobyqa Nelder_Mead nlminbwrap nmkbw
TRUE TRUE TRUE TRUE
optimx.L-BFGS-B nloptwrap.NLOPT_LN_NELDERMEAD nloptwrap.NLOPT_LN_BOBYQA
FALSE TRUE TRUE
$msgs
$msgs$bobyqa
NULL
$msgs$Nelder_Mead
NULL
$msgs$nlminbwrap
NULL
$msgs$nmkbw
[1] "Model failed to converge with max|grad| = 0.0013785 (tol = 0.001, component 1)"
$msgs$nloptwrap.NLOPT_LN_NELDERMEAD
[1] "Model failed to converge with max|grad| = 0.00904036 (tol = 0.001, component 1)"
$msgs$nloptwrap.NLOPT_LN_BOBYQA
[1] "Model failed to converge with max|grad| = 0.00904036 (tol = 0.001, component 1)"
$fixef
(Intercept) Pb_onL Pb_onN Pb_onS Pb_typeNG Pb_typeSong Dest_treeL ZF_Dest
bobyqa 0.4421569 0.5633434 0.1788602 -0.2046165 0.1777216 -0.1115736 0.5845617 0.01578428
Nelder_Mead 0.4420709 0.5633071 0.1788488 -0.2046262 0.1777301 -0.1115980 0.5845859 0.01578563
nlminbwrap 0.4421020 0.5633365 0.1788603 -0.2046102 0.1777225 -0.1115746 0.5845682 0.01578510
nmkbw 0.4423417 0.5635447 0.1791150 -0.2045143 0.1781367 -0.1114576 0.5844313 0.01578639
nloptwrap.NLOPT_LN_NELDERMEAD 0.4434285 0.5612392 0.1765273 -0.2063090 0.1773619 -0.1140108 0.5852130 0.01579032
nloptwrap.NLOPT_LN_BOBYQA 0.4434285 0.5612392 0.1765273 -0.2063090 0.1773619 -0.1140108 0.5852130 0.01579032
Pb_onL:Pb_typeNG Pb_onN:Pb_typeNG Pb_onL:Pb_typeSong Pb_onN:Pb_typeSong
bobyqa -1.140386 -0.1635209 -0.5446501 -0.2497627
Nelder_Mead -1.140380 -0.1635345 -0.5445915 -0.2497404
nlminbwrap -1.140389 -0.1635264 -0.5446236 -0.2497616
nmkbw -1.140884 -0.1640907 -0.5448729 -0.2500731
nloptwrap.NLOPT_LN_NELDERMEAD -1.138619 -0.1624441 -0.5413400 -0.2461967
nloptwrap.NLOPT_LN_BOBYQA -1.138619 -0.1624441 -0.5413400 -0.2461967
$llik
bobyqa Nelder_Mead nlminbwrap nmkbw
-826.3143 -826.3143 -826.3143 -826.3143
nloptwrap.NLOPT_LN_NELDERMEAD nloptwrap.NLOPT_LN_BOBYQA
-826.3143 -826.3143
$sdcor
Site.(Intercept)
bobyqa 1.537530
Nelder_Mead 1.537517
nlminbwrap 1.537501
nmkbw 1.537453
nloptwrap.NLOPT_LN_NELDERMEAD 1.538042
nloptwrap.NLOPT_LN_BOBYQA 1.538042
$theta
Site.(Intercept)
bobyqa 1.537530
Nelder_Mead 1.537517
nlminbwrap 1.537501
nmkbw 1.537453
nloptwrap.NLOPT_LN_NELDERMEAD 1.538042
nloptwrap.NLOPT_LN_BOBYQA 1.538042
$times
user.self sys.self elapsed user.child sys.child
bobyqa 6.51 0 6.55 NA NA
Nelder_Mead 3.06 0 3.06 NA NA
nlminbwrap 2.22 0 2.22 NA NA
nmkbw 2.17 0 2.17 NA NA
nloptwrap.NLOPT_LN_NELDERMEAD 0.94 0 0.94 NA NA
nloptwrap.NLOPT_LN_BOBYQA 0.86 0 0.86 NA NA
$feval
bobyqa Nelder_Mead nlminbwrap nmkbw
3098 1274 NA 672
nloptwrap.NLOPT_LN_NELDERMEAD nloptwrap.NLOPT_LN_BOBYQA
97 97
attr(,"class")
[1] "summary.allFit"
Using car::vif(full_model)
to check for collinearity results in:
> car::vif(m1_DD)
GVIF Df GVIF^(1/(2*Df))
Pb_on 17.606990 3 1.612925
Pb_type 20.180212 2 2.119490
ZF_Dest 1.015237 1 1.007589
ZF_Other 1.006738 1 1.003364
Pb_on:Pb_type 96.910527 4 1.771317
As I am not 100% sure how to interpret these, I used this link which suggests I shouldn't be too worried. As one answer states "The rule of GVIF(1/(2×Df))<2 is applied in some publications, which would equal to an ordinary VIF of 4 for one-coefficient variables". So Pb_type might form a problem, but I have often seen up to a VIF of 10 being seen as "acceptable" so I think this is fine.
EDIT 10-3-2020
After some more searching (now that I know what the bobyqa
really does, thanks to meriops!) I came across, this thread. Reading the answer there, I believe that my answers are now as follows:
As per what meriops said: "glmer uses a two-stage optimization by default, with bobyqa for the first stage and Nelder-Mead for the second stage. So when you change the optimizer to bobyqa, you use bobyqa for both stages."
Based on Ben Bolker's answer in this thread, and my own likelihoods differing <0.1, it might mean that the warning was a false positive to begin with. For now I still choose to adjust the optimizers.
Based on the aforementioned thread I am now under the assumption that this is indeed fine.
Similarly, the aforementioned thread suggests this not to be an issue in my case. As my models are indeed nested.
If any of you wish to elaborate your comments or adjust them into full answers then please do so, I will then accept them.
Upvotes: 2
Views: 2564
Reputation: 47
So it seems like you've had a lot of your questions answered here, but I do have a few recommendations:
- Fit all models with the same optimizers
- Center and scale any continuous predictors, in your case you should be wrapping ZF_Dest in scale like this scale(ZF_Dest)
when you call your models. This can help with convergence issues.
Something that has helped me a lot over the past years is Ben Bolker's FAQ page for GLMMs. Lot's of great information there about issues like this.
https://bbolker.github.io/mixedmodels-misc/glmmFAQ.html
Upvotes: 1