Reputation: 477
I am building a linear model in R and I have specific planned contrasts I am trying to implement. The model involves an interaction term. I believe I know how to correctly specify the matrix that glht()
takes in order to provide estimates for the contrasts. However, i'm unsure how to specify interaction planned contrasts in the matrix. Here is a reproducible example:
library(multcomp)
# Create minimal dataset
set.seed(100)
df <- data.frame(response = rnorm(2000),
age = sample(15:60, 2000, replace = TRUE),
sex = rep(c("Male", "Female"), each = 1000),
ses = rep(c("L", "LM", "MH", "H"), 500))
# Basic linear model
m1 <- lm(response ~ age + sex * ses,
data = df)
summary(m1)
Call:
lm(formula = response ~ age + sex * ses, data = df)
Residuals:
Min 1Q Median 3Q Max
-3.3423 -0.6697 0.0044 0.6716 3.2542
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.0514746 0.0889766 -0.579 0.563
age -0.0003435 0.0016934 -0.203 0.839
sexMale 0.0819362 0.0900868 0.910 0.363
sesL 0.0769168 0.0901245 0.853 0.394
sesLM 0.1289027 0.0901339 1.430 0.153
sesMH 0.0681291 0.0900821 0.756 0.450
sexMale:sesL -0.0661439 0.1274402 -0.519 0.604
sexMale:sesLM -0.0925924 0.1275157 -0.726 0.468
sexMale:sesMH -0.1188700 0.1273978 -0.933 0.351
Residual standard error: 1.007 on 1991 degrees of freedom
Multiple R-squared: 0.001591, Adjusted R-squared: -0.002421
F-statistic: 0.3966 on 8 and 1991 DF, p-value: 0.9229
# Create a matrix of linear combinations
# Here 'H' and 'Female' are the reference categories,
lc <- rbind("Female:H" = c(1, 0, 0, 0, 0, 0, 0, 0, 0),
"Female:L" = c(1, 0, 0, 1, 0, 0, 0, 0, 0),
"Female:LM" = c(1, 0, 0, 0, 1, 0, 0, 0, 0),
"Female:MH" = c(1, 0, 0, 0, 0, 1, 0, 0, 0),
"Male vs Female difference in L vs H" = c(1, 0, 0, 0, 0, 0, 1, 0, 0),
"Male vs Female difference in LM vs H" = c(1, 0, 0, 0, 0, 0, 0, 1, 0),
"Male vs Female difference in MH vs H" = c(1, 0, 0, 0, 0, 0, 0, 0, 1))
# Create matrix for the linfct argument in glht()
# One contrast compares the LM to the L category in females
# Another contrast compares the pooled LM and L categories to the H category in females
k <- rbind("Females: LM vs. L" =
lc["Female:LM", ] - lc["Female:L", ],
"Females: Pooled L and LM vs H" =
((lc["Female:LM", ] + lc["Female:L", ]) / 2) - lc["Female:H", ])
c1 <- glht(m1, linfct = k)
s1 <- summary(c1, adjusted(type = "none"))
confint(c1, adjusted(type = "none"))
I would like to add to k
, contrasts:
Edit
For example, in the summary output above you would interpret the sexMale:sesL
term as: The additional difference between L and H, for males vs. females.
This post in CrossValidated https://stats.stackexchange.com/questions/112944/design-of-matrix-of-contrasts-in-r was instrumental in getting me this far. However, i'm not sure what linear combinations from lc
above I should be subtracting in k
to get the coefficients I want.
Help is greatly appreciated.
Upvotes: 3
Views: 1174
Reputation: 6796
Unfortunately, I don't think your lc
and k
are correct. The matrix's col-number corresponds to names(m1$coef)
's elments number.
[1] Intercept, [2] age, [3] sexMale, [4] sesL, [5] sesLM, [6] sesMH, [7] sexMale:sesL, [8] sexMale:sesLM, [9] sexMale:sesMH
.
So, "Female:L" = c(1, 0, 1, 0, 0, 0, 0, 0, 0)
means "Male:H"
, and "Female:LM" = c(1, 0, 0, 1, 0, 0, 0, 0, 0)
means Female:L
. You used these difference as k[,1]
. Attention to s1
's Estimate, please.
s1
# Linear Hypotheses:
# Estimate Std. Error t value Pr(>|t|)
# Females: LM vs. L == 0 -0.005019 0.090091 -0.056 0.956
# Females: Pooled L and LM vs H == 0 0.079427 0.078038 1.018 0.309
# In fact, you did "sexMale:H vs Female:L".
m1$coef["sesL"] - m1$coef["sexMale"] # -0.005019421
You can get correct matrix by the method the post you linked showed.
group <- paste0(df$sex, df$ses)
group <- aggregate(model.matrix(m1) ~ group, FUN=mean)
rownames(group) <- group$group
group <- group[,-1]
group$age <- 0
lc2 <- as.matrix(group)
rownames(lc2) <- c("Female:H", "Female:L", "Female:LM", "Female:MH",
"Male:H", "Male:L", "Male:LM", "Male:MH")
lc2
(Intercept) age sexMale sesL sesLM sesMH sexMale:sesL sexMale:sesLM sexMale:sesMH
Female:H 1 0 0 0 0 0 0 0 0
Female:L 1 0 0 1 0 0 0 0 0
Female:LM 1 0 0 0 1 0 0 0 0
Female:MH 1 0 0 0 0 1 0 0 0
Male:H 1 0 1 0 0 0 0 0 0
Male:L 1 0 1 1 0 0 1 0 0
Male:LM 1 0 1 0 1 0 0 1 0
Male:MH 1 0 1 0 0 1 0 0 1
k2 <- rbind("Females: LM vs. L" =
lc2["Female:LM", ] - lc2["Female:L", ],
"Females: Pooled L and LM vs H" =
((lc2["Female:LM", ] + lc2["Female:L", ]) / 2) - lc2["Female:H", ])
c1_2 <- glht(m1, linfct = k2)
s1_2 <- summary(c1_2, adjusted(type = "none"))
s1_2
# Linear Hypotheses:
# Estimate Std. Error t value Pr(>|t|)
# Females: LM vs. L == 0 0.05199 0.09008 0.577 0.564
# Females: Pooled L and LM vs H == 0 0.10291 0.07807 1.318 0.188
m1$coef["sesLM"] - m1$coef["sesL"] # 0.05198589
(m1$coef["sesL"] + m1$coef["sesLM"])/2 # 0.1029097
The identity of sexMale
is the interaction term Male : H
just to be sure.
m2 <- lm(response ~ age + ses + sex:ses , data = df)
summary(m2) # excerpt
# sesH:sexMale 0.0819362 0.0900868 0.910 0.363
summary(m1) # excerpt
# sexMale 0.0819362 0.0900868 0.910 0.363
(Note: difference doesn't have an intercept because it is a common term)
k3 <- rbind(
"Male vs Female difference in L vs H" =
(lc2["Male:L",] - lc2["Male:H",]) - (lc2["Female:L",] - lc2["Female:H",]),
"Male vs Female difference in LM vs H" =
(lc2["Male:LM",] - lc2["Male:H",]) - (lc2["Female:LM",] - lc2["Female:H",]),
"Male vs Female difference in MH vs H" =
(lc2["Male:MH",] - lc2["Male:H",]) - (lc2["Female:MH",] - lc2["Female:H",]),
# above is modified lc[5:7, ], below is what you want
"Male vs Female difference in LM vs L" =
(lc2["Male:LM",] - lc2["Male:L",]) - (lc2["Female:LM",] - lc2["Female:L",]),
"Male vs Female difference in Pooled L and LM vs H" =
((lc2["Male:L", ] + lc2["Male:LM", ]) / 2 - lc2["Male:H", ])
- ((lc2["Female:L", ] + lc2["Female:LM", ]) / 2 - lc2["Female:H", ])
)
k3 # I deleted some character, space, digit because of the space and viewability
(Intercept) age sexM sesL sesLM sesMH sexM:sesL sexM:sesLM sexM:sesMH
M vs F diff in L vs H 0 0 0 0 0 0 1 0 0
M vs F diff in LM vs H 0 0 0 0 0 0 0 1 0
M vs F diff in MH vs H 0 0 0 0 0 0 0 0 1
M vs F diff in LM vs L 0 0 0 0 0 0 -1 1 0
MvsF diff in L&LM vs H 0 0 0 0 0 0 0.5 0.5 0
Upvotes: 2