Paul
Paul

Reputation: 115

Does emmeans use lm_robust cluster-robust standard errors when calculating confidence intervals for transformed outcome variables?

I'm examining interactions between two continuous predictor variables using the emmeans package. I'm using lm_robust() from the estimatr package to perform the linear regression and obtain cluster-robust standard errors. The outcome variable is centered and scaled to SD unit variance. For example:

fit <- lm_robust(scale(Y) ~ X1 * X2 + X3 + X4, data = mydata, cluster = school, se_type = 'CR2')

I can then perform pairwise contrasts or visualise the lines at three levels of X2 using code similar to:

emmip(fit, X2 ~ X1, CIs = TRUE, at = list(X2 = c(mean(X2) - sd(X2),
                                                 mean(X2),
                                                 mean(X2) + sd(X2))))

I do not wish to back transform the outcome variable to its original scale.

My question is whether emmeans uses the cluster-robust standard errors to calculate the confidence intervals or the p-values that it reports and does this behaviour depend on whether the outcome variable is on its original scale or transformed? A short example on the estimatr package creators' website suggests that lm_robust objects can be used with emmeans, but I can't see lm_robust listed as a supported model on the "Models supported by emmeans" vignette page or the package documentation.

Upvotes: 3

Views: 1163

Answers (1)

Russ Lenth
Russ Lenth

Reputation: 6770

I believe thar lm_robust objects are an extension of lm, so it uses the emmeans support for lm. In turn, that would imply that the estimates are ovtained via coef(model) and their SEs are derived using vcov(model). So if vcov() returns the robust variances you need, emmeans will use them.

Regarding most transformations, it will work as described in the vignette on transformations. In particular, specifying type = "response" results in the estimates and confidence limits to be back-transformed, the P values to be left alone, and the SEs to be computed by the delta method (but not used in the CIs and tests).

Additional information

First, I found out that lm_robust does not inherit from lm; rather, the estimatr package incorporates its own support for emmeans. Not many details are given, but that developer of estimatr must believe that what is provided must be appropriate.

The scale() transformation is not built-in, because it is complicated. Just saying we used "scale" isn't as simple as saying it is "log", say, because to work with scale() results, we need to know what was used to center and divide the results.

The workaround is to create the object that emmeans() and its relatives need to invert the transformation; and that is a list of functions of the form returned by stats::make.link() or emmeans::make.tran(). Here is a function that will serve that purpose:

make.scaletran = function(y, ...) {
    sy = scale(y, ...)
    if(is.null(m <- attr(sy, "scaled:center")))
       m = 0
    if(is.null(s <- attr(sy, "scaled:scale")))
        s = 1
    list(
        linkfun = function(mu) (mu - m) / s,
        linkinv = function(eta) s * eta + m,
        mu.eta = function(eta) s,
        valideta = function(eta) TRUE,
        name = paste0("scale(", signif(m, 3), ", ", signif(s, 3), ")")
    )
}

To use this, you need to manually specify the transformation since it isn't auto-detected. Here is an example using the warpbreaks data already available in R:

> warp.lmr = lm_robust(scale(breaks) ~ tension, cluster = wool, 
+     se_type = 'CR2', data = warpbreaks)

> tran = make.scaletran(warpbreaks$breaks)

> emmeans(warp.lmr, "tension", tran = tran)
 tension emmean    SE df lower.CL upper.CL
 L        0.624 0.619 51   -0.618   1.8666
 M       -0.133 0.181 51   -0.497   0.2301
 H       -0.491 0.219 51   -0.930  -0.0517

Results are given on the scale(28.1, 13.2) (not the response) scale. 
Confidence level used: 0.95 

> emmeans(warp.lmr, "tension", tran = tran, type = "response")
 tension response   SE df lower.CL upper.CL
 L           36.4 8.17 51     20.0     52.8
 M           26.4 2.39 51     21.6     31.2
 H           21.7 2.89 51     15.9     27.5

Confidence level used: 0.95 
Intervals are back-transformed from the scale(28.1, 13.2) scale 

The code in the OP for the emmip() call is not correct, as it uses the specs for emmeans(), not emmip().

I will consider adding this scale transformation option to emmeans::make.tran() in a future update.

Upvotes: 7

Related Questions