T Meurs

Reputation: 11

Multicollinearity in Linear Mixed model

I am fitting the following lmer models in R:

lmer(rt~deadline*cond+age+(1+deadline|task/pp), REML=FALSE) ##Model 1
lmer(rt~deadline+cond+age+(1+deadline|task/pp), REML=FALSE) ##Model 2

Here rt is reaction time, deadline has two levels (short or long), and cond also has two levels (easy or hard). In my study, 30 subjects each did 4 tasks. Per task, subjects did 50 trials in each of the four combinations (short/easy, long/easy, short/hard, long/hard), so in total each subject did 800 trials. Accuracy and reaction time were recorded. In the models above, I have random intercepts for person and task, and random slopes of deadline for person and task. I am interested in whether complexity (= cond) and urgency (= deadline) have main or interaction effects on reaction time. Since there might be an interaction effect, I fitted the first model with the interaction and compare it with a second model without the interaction.

When I run the first model, I get the following output:

lm.rtfnew <- lmer(rt~deadline*cond+age+(1+deadline|task/pp), REML=FALSE)
fixed-effect model matrix is rank deficient so dropping 1 column / coefficient
summary(lm.rtfnew)
Linear mixed model fit by maximum likelihood  ['lmerMod']
Formula: rt ~ deadline * cond + age + (1 + deadline | task/pp)
 ...
                           Estimate Std. Error t value
   (Intercept)             5.874631   0.669971    8.77
 deadlineshort            -0.375643   0.171779   -2.19
      condhard            -4.685013   0.066538  -70.41
      condeasy            -4.658016   0.066538  -70.01
           age             0.006791   0.018018    0.38
   deadlineshort:condhard  0.007752   0.018960    0.41
 ...
 fit warnings:
     fixed-effect model matrix is rank deficient so dropping 1 column / coefficient 
So I have a problem: it seems that two columns are created for both deadline and cond, and that these two columns are perfectly multicollinear. R then fixes this by dropping one column for deadline, but not for cond (there are fixed effects for both condhard and condeasy). Therefore I have two questions:

  1. Why doesn't R drop a column of cond?
  2. Do I need to fix it manually?

Upvotes: 1

Views: 961

Answers (1)

Roland

Reputation: 132969

With the additional information provided by the OP in the comments, one problem became apparent: cond has a factor level "" (the empty string). Because "" sorts first, it is the reference level and is represented by the intercept, which is why both condhard and condeasy appear as coefficients. The "hard" and "easy" rts are quite different from the "" rts, but very similar to each other in comparison. These "" values are actually missing values and should be encoded as NA. Once that is done, the corresponding observations are removed by na.action = na.omit. This might already fix the rank-deficiency problem. If it doesn't, the OP could consider scaling variables or dropping the interaction term.
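A minimal sketch of the recoding, using base R only. The data frame `dat` and its values are made up for illustration and are not the OP's data. It also assumes that the "" rows occur under only one deadline, which is one way such a level can make the interaction model matrix rank deficient; that assumption has not been checked against the real data.

```r
# Toy data (hypothetical): 3 observations per deadline/cond combination
dat <- expand.grid(
  deadline = c("long", "short"),
  cond     = c("easy", "hard"),
  rep      = 1:3,
  stringsAsFactors = FALSE
)
# Assumption: rows with an empty cond exist only for the long deadline
dat <- rbind(dat, data.frame(deadline = "long", cond = "", rep = 1:3,
                             stringsAsFactors = FALSE))
dat$deadline <- factor(dat$deadline)
dat$cond     <- factor(dat$cond)

levels(dat$cond)               # "" sorts first, so it is the reference level
table(dat$deadline, dat$cond)  # look for empty cells in the design

# Compare the number of columns with the rank of the fixed-effect design
X <- model.matrix(~ deadline * cond, data = dat)
c(columns = ncol(X), rank = qr(X)$rank)

# Recode "" as NA and drop the now unused level
dat$cond[dat$cond == ""] <- NA
dat$cond <- droplevels(dat$cond)

# Rows with NA are omitted under the default na.action (na.omit)
X2 <- model.matrix(~ deadline * cond, data = dat)
c(columns = ncol(X2), rank = qr(X2)$rank)
colnames(X2)
```

In this toy example the design matrix has more columns than its rank before the recoding and full column rank afterwards. After the same recoding on the real data, the original lmer() call can be refitted unchanged to see whether the warning disappears.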

Upvotes: 0
