Interpreting 95% CI for model estimated difference between two level factor in linear mixed-effects model

Question

This is my data frame (please copy and paste to reproduce):

Control <- replicate(2, c("112", "113", "116", "118", "127", "131", "134", "135", "136", "138", "143", "148", "149", "152", "153", "155", "162", "163"))
EPD <- replicate(2, c("101", "102", "103", "104", "105", "106", "107", "108", "109", "110", "114", "115", "117", "119", "120", "122", "124", "125", "126", "128", "130", "133", "137", "139", "140", "141", "142", "144", "145", "147"))
Subject <- c(Control, EPD)
Control_FA_L <- c(0.43, 0.39, 0.38, 0.58, 0.37, 0.5, 0.35, 0.36, 0.72, 0.38, 0.45, 0.30, 0.47, 0.30, 0.67, 0.34, 0.42, 0.29)
Control_FA_R <- c(0.36, 0.49, 0.55, 0.59, 0.33, 0.41, 0.32, 0.50, 0.59, 0.52, 0.32, 0.40, 0.49, 0.33, 0.46, 0.39, 0.37, 0.33)
EPD_FA_L <- c(0.25, 0.39, 0.36, 0.42, 0.21, 0.40, 0.43, 0.16, 0.31, 0.41, 0.39, 0.40, 0.35, 0.29, 0.31, 0.24, 0.39, 0.36, 0.54, 0.38, 0.34, 0.28, 0.42, 0.33, 0.40, 0.36, 0.42, 0.28, 0.40, 0.41)
EPD_FA_R <- c(0.26, 0.36, 0.36, 0.61, 0.22, 0.33, 0.36, 0.34, 0.35, 0.37, 0.39, 0.45, 0.30, 0.31, 0.50, 0.31, 0.29, 0.43, 0.41, 0.21, 0.38, 0.28, 0.66, 0.33, 0.50, 0.27, 0.46, 0.37, 0.26, 0.39)
FA <- c(Control_FA_L, Control_FA_R, EPD_FA_L, EPD_FA_R)
Control_Volume_L <- c(99, 119, 119, 146, 127, 96, 100, 132, 103, 103, 107, 142, 140, 134, 117, 117, 133, 143)
Control_Volume_R <- c(93, 123, 114, 152, 122, 105, 98, 138, 111, 110, 115, 137, 142, 140, 124, 102, 153, 143)
EPD_Volume_L <- c(132, 115, 140, 102, 130, 131, 110, 124, 102, 111, 93, 92, 94, 104, 92, 115, 144, 118, 104, 132, 90, 102, 94, 112, 106, 105, 79, 114, 104, 108)
EPD_Volume_R <- c(136, 116, 143, 105, 136, 137, 103, 121, 105, 115, 97, 97, 93, 108, 91, 117, 147, 111, 97, 129, 85, 107, 91, 116, 113, 101, 75, 108, 95, 98)
Volume <- c(Control_Volume_L, Control_Volume_R, EPD_Volume_L, EPD_Volume_R)
Group <- c(replicate(36, "Control"), replicate(60, "Patient"))

data <- data.frame(Subject, FA, Volume, Group)

I then run a linear mixed model for FA values with the nlme package:

library(nlme)
lmm <- lme(FA ~ Volume + Group, ~ 1|Subject, data = data)
summary(lmm)

I would now like to determine the 95% Confidence Interval for the model estimated difference in FA between the two levels of the "Group" factor (Control & Patient). I would normally proceed by executing the following code:

# Compute 95% Confidence Interval for Group factor

# True difference in STN FA between Control and EPD subjects
0.0857851 # Value from mixed model

# Multiply 97.5 percentile point of normal distribution by std error from mixed model
1.96 * 0.02555076 # 95% CI:  0.086 ± 0.050 mm^3 (p = .0016) - !!CI includes values > 1!!

I am having a hard time interpreting what this means. The confidence interval I have calculated includes values greater than 1 which doesn't make sense since FA is supposed to be a ratio value from 0 to 1. Is the fact that my dependent variable is a ratio value the issue? If so would I need to transform my data in some way (i.e. log transform) to correct this? Any feedback would be greatly appreciated!

Maurits Evers · Accepted Answer

As pointed out by @42-, the issue here is with the model itself. Since FD is restricted to [0, 1] we can't use lme which assumes normal errors.

Model definition

I don't know any details about your data/experiment but perhaps a Beta model may work; specifically we could use a varying-intercept hierarchical model of the form

where we connect the mean of FD μ to a Subject-specific intercept and the predictors Volume and Group through a logit-link.

Implementation

The library glmmTMB allows to implement such a mixed-effect model

library(glmmTMB)
lmm <- glmmTMB(
    FA ~ Volume + Group + (1 | Subject),
    data = data,
    family = "beta_family")
summary(lmm)$coef$cond
#                 Estimate  Std. Error   z value     Pr(>|z|)
#(Intercept)   0.502858259 0.348506927  1.442893 0.1490505719
#Volume       -0.006464251 0.002782781 -2.322947 0.0201820253
#GroupPatient -0.369273205 0.104832100 -3.522520 0.0004274642

Some comments on the estimates

Note that estimates are given on the logit (log odds) scale; the estimate for Group = Control is then 0.503 - 0.369 * 0 = 0.503, and for Group = Patient it is 0.503 - 0.369 * 1 = 0.134. The difference between Group = Patient and Group = Control (again on the logit scale) is simply the coefficient for GroupPatient which is -0.369.

Comparison of marginal means

I then recommend using emmeans for any follow-up analyses; in this case we can use emmeans::pairs to compare the estimated marginal means (EMMs) for the two Group levels

library(emmeans)
confint(pairs(emmeans(lmm, "Group")))
# contrast           estimate        SE df  lower.CL  upper.CL
# Control - Patient 0.3692732 0.1048321 91 0.1610371 0.5775093
# 
#Results are given on the log odds ratio (not the response) scale.
#Confidence level used: 0.95

Note that results are given on the logit scale (and not on the response scale). To get a ratio of FD responses for Group = Patient and Group = Control you need to manually convert these estimates.

Explanation: Here emmeans returns the EMMs for Group, and pairs performs a pairwise comparison of the different levels of Group. We then use confint to return (default 95%) confidence intervals.

The nice thing is that you wouldn't have to change anything if you have >2 levels for Group; pairs would perform pairwise comparisons, and automatically correct p-values for multiple hypothesis testing.

For more reading, take a look at the excellent vignette Comparisons and contrasts in emmeans.

You can also get estimated marginal means and confidence intervals on the odds ratio scale (which avoids having to manually transform from the logit scale to the odds ratio scale)

confint(pairs(emmeans(lmm, "Group"), type = "response"))
# contrast          odds.ratio        SE df lower.CL upper.CL
# Control / Patient   1.446683 0.1516588 91 1.174729 1.781595

Confidence level used: 0.95
Intervals are back-transformed from the log odds ratio scale

Interpreting 95% CI for model estimated difference between two level factor in linear mixed-effects model

Answers (1)

Model definition

Implementation

Comparison of marginal means

Related Questions