Reputation: 1
I am trying to run an HLM analysis in R that predicts "ranking" from the remaining columns. I want to include interaction effects, but I keep getting the warning "fit warnings: fixed-effect model matrix is rank deficient so dropping 2 columns / coefficients". There is no linear dependence between the predictors, so I wonder whether the warning arises because all four predictors used in the interaction terms are binary coded. I already tried wrapping them in factor(), but that changes nothing. The code is the following:
version_3_test <- lmer(ranking ~ 1 + ac_fc + npdc_fc +
                         ed_fc + iks_fc + ac_fc:ed_fc + ac_fc:iks_fc +
                         npdc_fc:ed_fc + npdc_fc:iks_fc + bu_ed_fc +
                         nat_ed_fc + eng_ed_fc + it_ind_fc + ph_ind_fc +
                         eng_ind_fc + work + age +
                         (1 + ac_fc + npdc_fc + ed_fc + iks_fc | id_ind),
                       data = df_finaldata, REML = FALSE)
summary(version_3_test)
ac_fc, npdc_fc, iks_fc and ed_fc are factor variables, each with two levels.
Does anyone know how to solve this problem or what else could be the reason for the warning?
I tried interaction terms with other predictors that are not binary coded and this worked.
Upvotes: 0
Views: 566
Reputation: 226741
There are almost certainly multicollinear columns in your model matrix.
If the four variables ac_fc, npdc_fc, ed_fc, iks_fc represent indicator variables for an exhaustive and mutually exclusive set of categories (i.e. if every observation comes from the set {ac, npdc, ed, iks}), then it's natural that these columns would be multicollinear (because their sum is always 1, which reproduces the intercept column).
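As a rough illustration of this first case, here is a minimal base-R sketch. The data are made up and only borrow the category labels from the question; it assumes each observation belongs to exactly one of the four categories and that the model has an intercept.

```r
# Toy data (made up): every observation is in exactly one of four categories
grp <- factor(rep(c("ac", "npdc", "ed", "iks"), times = 10),
              levels = c("ac", "npdc", "ed", "iks"))

# One 0/1 indicator column per category
ind <- sapply(levels(grp), function(l) as.numeric(grp == l))

# Model matrix with an intercept plus all four indicators
X <- cbind("(Intercept)" = 1, ind)

row_sums <- rowSums(ind)  # the indicators sum to 1 in every row
rank_X <- qr(X)$rank      # one less than the number of columns
print(c(columns = ncol(X), rank = rank_X))
```

Because the four indicators add up to the intercept column, one of the five columns carries no extra information, and a fitting routine has to drop it.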
Another possibility is that you have interaction terms in your model where some combinations of levels are missing from the data (e.g. A = {1, 2}, B = {1, 2}, you have A*B or A:B in your model, but A=2; B=2 never occurs in your data).
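One way to check for this second case is to cross-tabulate the factors involved in each interaction and look for zero counts. The sketch below uses a made-up data frame with hypothetical factors A and B, not the data from the question.

```r
# Toy data (made up): the combination A = 2, B = 2 never occurs
d <- data.frame(
  A = factor(c(1, 1, 1, 2, 2, 1, 1, 2), levels = c(1, 2)),
  B = factor(c(1, 2, 1, 1, 1, 2, 2, 1), levels = c(1, 2))
)

# Cross-tabulate the two factors; a zero count marks a missing combination
cell_counts <- table(d$A, d$B, dnn = c("A", "B"))
print(cell_counts)

# The interaction column A2:B2 is all zeros, so the matrix loses one rank
X <- model.matrix(~ A * B, data = d)
n_empty <- sum(cell_counts == 0)
rank_gap <- ncol(X) - qr(X)$rank
```

For the model in the question, the analogous check would be something like table(df_finaldata$ac_fc, df_finaldata$ed_fc), repeated for each pair of factors that appears in an interaction term.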
More generally, the following should diagnose exactly which set(s) of columns are multicollinear:
ff <- lme4::nobars(formula(version_3_test))  # fixed-effects part of the formula only
X <- model.matrix(ff, data = df_finaldata)   # fixed-effect model matrix
caret::findLinearCombos(X)
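If caret is not available, a similar check can be sketched in base R with a pivoted QR decomposition. The matrix below is made up (column x3 is deliberately x1 + x2), and which column of a dependent set gets flagged depends on the column order, so treat the result as a pointer rather than a definitive diagnosis.

```r
# Toy model matrix (made up) in which column x3 equals x1 + x2
X <- cbind("(Intercept)" = 1,
           x1 = c(1, 0, 1, 0, 1, 0),
           x2 = c(0, 1, 1, 0, 0, 1),
           x3 = c(1, 1, 2, 0, 1, 1))

qr_X <- qr(X)

# Columns pivoted beyond the numerical rank are the linearly dependent ones
dependent <- colnames(X)[qr_X$pivot[seq_len(ncol(X)) > qr_X$rank]]
print(dependent)
```

Applied to the fixed-effect model matrix from the question, the same lines would list the columns that would have to go to restore full rank, which should correspond to the two coefficients mentioned in the warning.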
Upvotes: 0