esk

Reputation: 1

lme4: "fit warnings: fixed-effect model matrix is rank deficient so dropping 2 columns / coefficients" - how to fix it?

I am trying to run an HLM analysis in R that predicts "ranking" with the rest of the columns as predictors. I want to include interaction effects, but I keep getting the warning message "fit warnings: fixed-effect model matrix is rank deficient so dropping 2 columns / coefficients". There is no linear dependence between the predictors, so I am wondering whether this happens because all four predictors used in the interaction terms are binary coded. I already tried to solve the problem with factor(), but it doesn't change anything. The code is the following:

version_3_test <- lmer(
    ranking ~ 1 + ac_fc + npdc_fc + ed_fc + iks_fc +
        ac_fc:ed_fc + ac_fc:iks_fc + npdc_fc:ed_fc + npdc_fc:iks_fc +
        bu_ed_fc + nat_ed_fc + eng_ed_fc + it_ind_fc + ph_ind_fc +
        eng_ind_fc + work + age +
        (1 + ac_fc + npdc_fc + ed_fc + iks_fc | id_ind),
    data = df_finaldata, REML = FALSE)
summary(version_3_test)

ac_fc, npdc_fc, iks_fc and ed_fc are factor variables, each with two levels.

Does anyone know how to solve this problem or what else could be the reason for the warning?

I tried interaction terms with other predictors that are not binary coded and this worked.

Upvotes: 0

Views: 566

Answers (1)

Ben Bolker

Reputation: 226741

There are almost certainly multicollinear columns in your model matrix.

If the four variables ac_fc, npdc_fc, ed_fc, iks_fc represent indicator variables for an exhaustive and mutually exclusive set of categories (i.e. if every observation comes from exactly one member of the set {ac, npdc, ed, iks}), then it's natural that these columns would be multicollinear: their sum is always 1, which duplicates the intercept column.
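To illustrate the indicator-variable case, here is a minimal sketch on made-up data (the `category` vector is hypothetical and has nothing to do with the asker's `df_finaldata`). It builds one 0/1 column per category and shows that, together with an intercept, the matrix has fewer independent columns than it has columns:

```r
# Toy data (an assumption for illustration): every observation belongs to
# exactly one of four categories, 10 observations per category.
category <- factor(rep(c("ac", "npdc", "ed", "iks"), times = 10))

# One 0/1 indicator column per category level.
ind <- sapply(levels(category), function(l) as.integer(category == l))

# Exhaustive and mutually exclusive: each row has exactly one 1.
row_sums_are_one <- all(rowSums(ind) == 1)

# Add an intercept column, as a model with "1 +" in the formula would.
X <- cbind(`(Intercept)` = 1, ind)

n_cols <- ncol(X)      # number of columns in the design matrix
x_rank <- qr(X)$rank   # number of linearly independent columns

n_cols - x_rank        # how many columns would have to be dropped
```

If the same check on your own indicator columns shows row sums that are always 1, dropping one of the indicators (or the intercept) removes the redundancy.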

Another possibility is that you have interaction terms in your model where some combinations of levels are missing from the data (e.g. A = {1, 2}, B = {1, 2}, you have A*B or A:B in your model, but the combination A = 2, B = 2 never occurs in your data).
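A quick way to check for missing combinations is to cross-tabulate each pair of factors that appears in an interaction. The sketch below uses a made-up data frame `d` with factors `A` and `B` (both hypothetical) in which the cell A = 2, B = 2 is empty; for the real data the analogous call would be something like `with(df_finaldata, table(ac_fc, ed_fc))`, repeated for each interaction pair.

```r
# Toy data (an assumption for illustration): A = 2 together with B = 2
# never occurs.
d <- data.frame(
  A = factor(c(1, 1, 2, 1, 1, 2)),
  B = factor(c(1, 2, 1, 1, 2, 1))
)

# Cross-tabulate the two factors; a zero cell is a missing combination.
tab <- table(A = d$A, B = d$B)
has_empty_cell <- any(tab == 0)

# Design matrix for main effects plus interaction.
X <- model.matrix(~ A * B, data = d)

# The interaction column is all zeros, so it carries no information.
interaction_sum <- sum(X[, "A2:B2"])

x_rank <- qr(X)$rank   # linearly independent columns
n_cols <- ncol(X)      # total columns
```

With an empty cell the interaction coefficient cannot be estimated from the data, so dropping that column loses nothing.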

More generally,

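library(lme4)                               # provides nobars()
ff <- nobars(formula(version_3_test))       # formula with the random-effects terms removed
X <- model.matrix(ff, data = df_finaldata)  # fixed-effects model matrix
caret::findLinearCombos(X)                  # dependent column sets and columns to remove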

should diagnose exactly which set(s) of columns is/are multicollinear.
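If you would rather not install caret, a rough base-R alternative is to use the pivoted QR decomposition of the model matrix. The sketch below runs on a made-up matrix (column names `x1`, `x2`, `x3` are hypothetical); with the real data you would use the `X` built from `model.matrix()` instead. Note that when several columns are mutually dependent, which one gets flagged is somewhat arbitrary.

```r
# Toy design matrix (an assumption for illustration): x3 is an exact
# linear combination of x1 and x2.
set.seed(42)
X <- cbind(`(Intercept)` = 1, x1 = rnorm(10), x2 = rnorm(10))
X <- cbind(X, x3 = X[, "x1"] + X[, "x2"])

# qr() moves columns it finds (numerically) dependent to the end of the
# pivot order; the first `rank` pivots are the columns that are kept.
qrX <- qr(X)
dropped <- colnames(X)[qrX$pivot[-seq_len(qrX$rank)]]

dropped   # names of the columns flagged as redundant
```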

Upvotes: 0
