Reputation: 115
I'm examining interactions between two continuous predictor variables using the emmeans
package. I'm using lm_robust()
from the estimatr
package to perform the linear regression and obtain cluster-robust standard errors. The outcome variable is centered and scaled to SD unit variance. For example:
fit <- lm_robust(scale(Y) ~ X1 * X2 + X3 + X4, data = mydata, cluster = school, se_type = 'CR2')
I can then perform pairwise contrasts or visualise the lines at three levels of X2
using code similar to:
emmip(fit, X2 ~ X1, CIs = TRUE, at = list(X2 = c(mean(X2) - sd(X2),
mean(X2),
mean(X2) + sd(X2))))
I do not wish to back transform the outcome variable to its original scale.
My question is whether emmeans
uses the cluster-robust standard errors to calculate the confidence intervals or the p-values that it reports and does this behaviour depend on whether the outcome variable is on its original scale or transformed? A short example on the estimatr
package creators' website suggests that lm_robust
objects can be used with emmeans
, but I can't see lm_robust
listed as a supported model on the "Models supported by emmeans" vignette page or the package documentation.
Upvotes: 3
Views: 1163
Reputation: 6770
I believe thar lm_robust objects are an extension of lm, so it uses the emmeans support for lm. In turn, that would imply that the estimates are ovtained via coef(model) and their SEs are derived using vcov(model). So if vcov() returns the robust variances you need, emmeans will use them.
Regarding most transformations, it will work as described in the vignette on transformations. In particular, specifying type = "response" results in the estimates and confidence limits to be back-transformed, the P values to be left alone, and the SEs to be computed by the delta method (but not used in the CIs and tests).
First, I found out that lm_robust
does not inherit from lm
; rather, the estimatr package incorporates its own support for emmeans. Not many details are given, but that developer of estimatr must believe that what is provided must be appropriate.
The scale()
transformation is not built-in, because it is complicated. Just saying we used "scale"
isn't as simple as saying it is "log"
, say, because to work with scale()
results, we need to know what was used to center and divide the results.
The workaround is to create the object that emmeans()
and its relatives need to invert the transformation; and that is a list of functions of the form returned by stats::make.link()
or emmeans::make.tran()
. Here is a function that will serve that purpose:
make.scaletran = function(y, ...) {
sy = scale(y, ...)
if(is.null(m <- attr(sy, "scaled:center")))
m = 0
if(is.null(s <- attr(sy, "scaled:scale")))
s = 1
list(
linkfun = function(mu) (mu - m) / s,
linkinv = function(eta) s * eta + m,
mu.eta = function(eta) s,
valideta = function(eta) TRUE,
name = paste0("scale(", signif(m, 3), ", ", signif(s, 3), ")")
)
}
To use this, you need to manually specify the transformation since it isn't auto-detected. Here is an example using the warpbreaks
data already available in R:
> warp.lmr = lm_robust(scale(breaks) ~ tension, cluster = wool,
+ se_type = 'CR2', data = warpbreaks)
> tran = make.scaletran(warpbreaks$breaks)
> emmeans(warp.lmr, "tension", tran = tran)
tension emmean SE df lower.CL upper.CL
L 0.624 0.619 51 -0.618 1.8666
M -0.133 0.181 51 -0.497 0.2301
H -0.491 0.219 51 -0.930 -0.0517
Results are given on the scale(28.1, 13.2) (not the response) scale.
Confidence level used: 0.95
> emmeans(warp.lmr, "tension", tran = tran, type = "response")
tension response SE df lower.CL upper.CL
L 36.4 8.17 51 20.0 52.8
M 26.4 2.39 51 21.6 31.2
H 21.7 2.89 51 15.9 27.5
Confidence level used: 0.95
Intervals are back-transformed from the scale(28.1, 13.2) scale
The code in the OP for the emmip()
call is not correct, as it uses the specs for emmeans()
, not emmip()
.
I will consider adding this scale transformation option to emmeans::make.tran()
in a future update.
Upvotes: 7