Reputation: 55
I am running a regression with clustered standard errors by year. This is easy to do with Stata but I have to do it with R, so I run it using the lm_robust()
function from the estimatr
package. The problem is that I must now get the marginal effects of some variables but I cannot do it and I guess it is because of the cluster standard error. I followed what is on the manual for lm_robust()
and I've seen they only used the margins command from the margins package for other functions without clustered standard errors... Does anyone have a clue on how can I get and plot the marginal effects?
set.seed(42)
library(fabricatr)
library(randomizr)
dat <- fabricate(
N = 100, # sample size
x = runif(N, 0, 1), # pre-treatment covariate
y0 = rnorm(N, mean = x), # control potential outcome
y1 = y0 + 0.35, # treatment potential outcome
z = complete_ra(N), # complete random assignment to treatment
y = ifelse(z, y1, y0), # observed outcome
# We will also consider clustered data
clust = sample(rep(letters[1:20], each = 5)),
z_clust = cluster_ra(clust),
y_clust = ifelse(z_clust, y1, y0)
)
Then when I run the regression with the lm_robust()
function:
library(estimatr)
lmout_cl <- lm_robust(
y_clust ~ z_clust + x,
data = dat,
clusters = clust
)
And finally, I try to get the margins...
library(margins)
mar_cl <- margins(lmout_cl)
But this results in an error:
Error in attributes(.Data) <- c(attributes(.Data), attrib) :'names' attribute
[1] must be the same length as the vector [0]
Upvotes: 5
Views: 2863
Reputation: 475
Apologies for this bug which prevents margins()
from working with lm_robust()
objects with non-numeric clusters in estimatr
versions 0.10 and earlier. This was created by the internal way both estimatr::lm_robust()
and margins::margins()
handle which variables are in the model.
The bug has since been solved and so you have two solutions within estimatr
.
Let me first generate the data.
library(fabricatr)
library(randomizr)
dat <- fabricate(
N = 100,
x = runif(N),
clust = sample(rep(letters[1:20], each = 5)),
y_clust = rnorm(N),
z_clust = cluster_ra(clust),
)
Get the latest version of estimatr
(v0.11.0)
The dev version on https://declaredesign.org/r/estimatr has a fix for this bug, and it will be on CRAN in the next month or so.
install.packages("estimatr", dependencies = TRUE,
repos = c("http://r.declaredesign.org", "https://cloud.r-project.org"))
library(estimatr)
lmout_cl <- lm_robust(
y_clust ~ z_clust + x,
data = dat,
clusters = clust
)
library(margins)
mar_cl <- margins(lmout_cl)
Use numeric clusters with CRAN version of estimatr
(v0.10.0)
A workaround with the existing version of estimatr
on CRAN is to use numeric clusters instead of character clusters
dat <- fabricate(
N = 100,
x = runif(N),
clust = sample(rep(1:20, each = 5)),
y_clust = rnorm(N),
z_clust = cluster_ra(clust),
)
install.packages("estimatr")
library(estimatr)
lmout_cl <- lm_robust(
y_clust ~ z_clust + x,
data = dat,
clusters = clust
)
mar_cl <- margins(lmout_cl)
Upvotes: 5
Reputation: 72613
The problem is that estimatr::lm_robust()
yields a "lm_robust"
object which seems not to be supported by margins()
at the moment. We can use miceadds::lm.cluster()
instead—and obtain the same clustered standard errors as Stata at that.
library(miceadds)
lmout_cl <- lm.cluster(y_clust ~ z_clust + x, data=dat, cluster=dat$clust)
This results in a list with two elements, where the normal lm
-object is stored in the first element and the variance-covariance matrix with clustered standard errors the second (see str(lmout_cl)
):
> names(lmout_cl)
[1] "lm_res" "vcov"
margins()
now can be specified as margins(model=model, vcov=vcov)
, so we say:
mar_cl <- with(lmout_cl, margins(lm_res, vcov=vcov))
Yielding
> mar_cl
Average marginal effects
stats::lm(formula = formula, data = data)
z_clust x
0.6558 1.444
and
> summary(mar_cl)
factor AME SE z p lower upper
x 1.4445 0.3547 4.0728 0.0000 0.7494 2.1396
z_clust 0.6558 0.1950 3.3633 0.0008 0.2736 1.0379
with clustered standard errors.
Comparison with Stata
R
foreign::write.dta(dat, "dat.dta") # export as Stata data to wd
Stata
. use dat, clear
(Written by R. )
. quietly regress y_clust z_clust x, vce(cluster clust)
. mfx
Marginal effects after regress
y = Fitted values (predict)
= .67420391
------------------------------------------------------------------------------
variable | dy/dx Std. Err. z P>|z| [ 95% C.I. ] X
---------+--------------------------------------------------------------------
z_clust*| .6557558 .19498 3.36 0.001 .273609 1.0379 .5
x | 1.444481 .35466 4.07 0.000 .749352 2.13961 .524479
------------------------------------------------------------------------------
(*) dy/dx is for discrete change of dummy variable from 0 to 1
.
As we can clearly see—in doing so R yields the same as Stata concerning both, clustered standard errors and marginal effects.
Upvotes: 3