melbez

Reputation: 1000

Running several regressions with clustering in R

I typically use the following code when I need to run several regressions.

outcomes <- colnames(df[,1:10]) #specifies column names for variables
form <- paste(outcomes, "~ covariate1 + covariate2")
model <- form %>%
  set_names(outcomes) %>%
  map(~lm(.x, data = df))
map(model, summary)

This then gives me the output for the regression of all outcome variables on covariate1 + covariate2.

I am trying to do the same thing but with clustered robust standard errors. I have used lm_robust from the estimatr package. This is the modification I have made to the above code.

outcomes <- colnames(df[,1:10]) 
form <- paste(outcomes, "~ covariate1 + covariate2")
model <- form %>%
  set_names(outcomes) %>%
  map(~lm_robust(.x, data = df, clusters = id))
map(model, summary)

As you can see, I have changed lm to lm_robust and have added an argument that specifies the level at which I want to cluster. Why doesn't this work when the above code does work? What would you suggest modifying to make this code run?

I am also open to completely new ways of running clustered and non-clustered regressions simultaneously.

Upvotes: 1

Views: 535

Answers (1)

Ronak Shah

Reputation: 388982

The main difference between lm and lm_robust is in the first argument: lm accepts a formula object or an object that is coercible to one, such as a string.

library(estimatr)
library(purrr)

So passing the formula as a string works with lm:

lm("mpg~gear+carb", data = mtcars)

which is the same as

lm(mpg~gear+carb, data = mtcars)

but passing the string does not work with lm_robust:

lm_robust("mpg~gear+carb", data = mtcars)

Error in formula[[2]] : subscript out of bounds

lm_robust accepts only an actual formula object:

lm_robust(mpg~gear+carb, data = mtcars)
#            Estimate Std. Error t value Pr(>|t|) CI Lower CI Upper DF
#(Intercept)     7.28      2.984    2.44 2.11e-02     1.17    13.38 29
#gear            5.58      0.933    5.98 1.69e-06     3.67     7.48 29
#carb           -2.75      0.366   -7.53 2.64e-08    -3.50    -2.01 29
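
The error message suggests that lm_robust indexes into its formula argument directly; that is my reading of the message, not something checked against the estimatr source. A minimal base R sketch of why such indexing works for a formula but fails for a string (f_string, f_formula and msg are illustrative names):

```r
f_string  <- "mpg ~ gear + carb"
f_formula <- as.formula(f_string)

class(f_string)   # "character"
class(f_formula)  # "formula"

# A formula is a call whose second element is the response
f_formula[[2]]    # mpg

# A length-one character vector has no second element,
# so the same indexing raises an error; capture its message
msg <- tryCatch(f_string[[2]], error = function(e) conditionMessage(e))
msg               # "subscript out of bounds"
```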

So convert the string to a formula object in your code and it should work.

model <- form %>%
         set_names(outcomes) %>%
         map(~lm_robust(as.formula(.x), data = df, clusters = id))
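
If you want the clustered and non-clustered fits side by side, one option is to build the formula objects once and map over them twice. This is only a sketch under assumptions: the toy data frame, the outcome names y1 to y3 and the 20 clusters are made up for illustration, and reformulate() from base R stands in for paste() plus as.formula().

```r
library(estimatr)
library(purrr)

# Hypothetical toy data: 3 outcomes, 2 covariates, 20 clusters of 10 rows
set.seed(1)
df <- data.frame(
  y1 = rnorm(200), y2 = rnorm(200), y3 = rnorm(200),
  covariate1 = rnorm(200), covariate2 = rnorm(200),
  id = rep(1:20, each = 10)
)

outcomes <- c("y1", "y2", "y3")

# reformulate() returns formula objects, so no string is ever passed on
forms <- map(
  set_names(outcomes),
  ~ reformulate(c("covariate1", "covariate2"), response = .x)
)

# Fit both versions for every outcome; results are named by outcome
plain     <- map(forms, ~ lm(.x, data = df))
clustered <- map(forms, ~ lm_robust(.x, data = df, clusters = id))

map(plain, summary)
map(clustered, summary)
```

Both lists are named by outcome, so plain$y1 and clustered$y1 refer to the same regression with different standard errors.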

Upvotes: 4
