Reputation: 5
I am estimating lmer models with subject random effects for a within-subject design study. I have a measurement of a dependent variable for each subject in three different treatment conditions, resulting in a balanced design. In addition to the treatment dummies I also have control variables in the lmer model.
First thing that stuck out is that all treatment dummies had equal standard errors, which has already been asked and answered here:
The second thing that stuck out was that the coefficients of the treatment dummies do not change if I add control variables to the model.
Here the behavior of lmer is reproduced with some simulated data:
library(tidyverse)
library(lme4)
library(lmerTest)
#Some data:
id <- rep(1:50) #subject id
dependent_1 <- rnorm(50,10,5) #dependent measure in treatment 1
dependent_2 <- rnorm(50,18,3) #dependent measure in treatment 2
dependent_3 <- rnorm(50,28,4) #dependent measure in treatment 3
control_a <- rnorm(50, 100, 5) #first control
control_b <- rnorm(50, 200,33) #second control
df <- data.frame(id, dependent_1, dependent_2, dependent_3, control_a, control_b) #make dataframe
#Reshape to long form
df_long <- pivot_longer(df,
cols = starts_with("dependent_"),
names_to = c(".value","treatment"),
names_sep = "\\_")
#Treatment to factor
df_long$treatment <- as.factor(df_long$treatment)
#LMER Models
lmer_model.1 <- lmer(dependent ~ treatment +(1|id), data = df_long, REML = FALSE) #Model with treatment dummies only
lmer_model.2 <- lmer(dependent ~ treatment + control_a + control_b + (1|id), data = df_long, REML = FALSE) #Model with treatment dummies and controls
I get the following results:
===============================================================
Model 1 Model 2
---------------------------------------------------------------
(Intercept) 9.246 (0.567) *** 17.535 (7.796) *
treatment2 8.157 (0.787) *** 8.157 (0.787) ***
treatment3 20.030 (0.787) *** 20.030 (0.787) ***
control_a -0.067 (0.072)
control_b -0.008 (0.011)
---------------------------------------------------------------
AIC 852.194 854.977
BIC 867.247 876.051
Log Likelihood -421.097 -420.488
Num. obs. 150 150
Num. groups: id 50 50
Var: id (Intercept) 0.596 0.457
Var: Residual 15.492 15.492
===============================================================
*** p < 0.001; ** p < 0.01; * p < 0.05
Can anyone explain to me the reason why this happens?
Upvotes: 0
Views: 363
Reputation: 690
This is apparently about statistics, not programming. Consider asking it on Cross Validated instead.
It seems that the answer to your question lies in the way you set up your example data. An additional control/predictor variable X2 only influences estimates of another, previously included predictor X1 they are (at least a bit) correlated. In reality, this is mostly true, as you hardly get r = .00 correlations in real-life data. But you set up your data in a way such that treatment
is orthogonal to control_a
and control_b
. Therefore, including either control does not impact the coefficients of the treatment dummies.
Upvotes: 0