rt.l
rt.l

Reputation: 306

r sandwich standard errors with felm (lfe) regression: dynamically defining the formula

I using the felm funcion of the lfe package to run a regression with many fixed effects, and I need to use the sandwich package to estimate the standard errors. The formula used in the regression is built dynamically, and I save it in a variable "form". When I call the sandwich functions, it does not understand the model with the formula saved in a variable.

Here is a simple example:

# gen process
set.seed(42)
nn = 10
n1 = 3

x <- rnorm(nn)
f1 <- sample(n1, length(x), replace=TRUE)
y <- 2.13*x + cos(f1) + rnorm(length(x), sd=0.5)

# sandwich working
est <- lfe::felm(y ~ x | f1)
summary(est)
sandwich::vcovPL(est)

# sandwich not working
form <- as.formula("y ~ x | f1")
est <- lfe::felm(form)
summary(est)
sandwich::vcovPL(est)

Eventhough the results of the regressions are the same, in the second case I can not use the sandwich function, and this last line gives an error that reads: Error: object of type 'symbol' is not subsettable

Any clue on how I can fix this?

Thanks a lot,

Upvotes: 1

Views: 632

Answers (1)

Laurent Berg&#233;
Laurent Berg&#233;

Reputation: 1372

Update

On exactly why the second version of the call doesn't work: it has to do with model.matrix which doesn't work (I've no solution sorry). Be careful though with the compatibility between lfe and sandwich, as highlighted in my first answer below.


There are two things. First, (I think but to be confirmed) felm objects seem not directly compatible with sandwich variances, leading to erroneous results. Second, the are many details involved in computing the standard-errors, notably the decision regarding the degrees of freedom to consider -- this is the main cause of differences across software.

Here is an example of how to (almost) obtain the felm VCOV with sandwich:

base = iris
names(base) = c("y", "x1", "x2", "x3", "species")
 
library(lfe) ; library(sandwich)
 
est_felm = felm(y ~ x1 + x2 + x3 | species | 0 | species, base)

# vcovCL does not work appropriately when applied to felm objects
vcov(est_felm) / vcovCL(est_felm, cluster = base$species)
#>           x1        x2        x3
#> x1 0.1126600 0.1011106 0.7028052
#> x2 0.1011106 0.1074147 0.1702584
#> x3 0.7028052 0.1702584 0.2571940
 
# Equivalent lm estimation
est_lm = lm(y ~ x1 + x2 + x3 + species, base)

# Now: almost equivalence. They are proportional.
vcov(est_felm) / vcovCL(est_lm, cluster = base$species)[2:4, 2:4]
#>           x1        x2        x3
#> x1 0.9863014 0.9863014 0.9863014
#> x2 0.9863014 0.9863014 0.9863014
#> x3 0.9863014 0.9863014 0.9863014

As you can see, you need to first estimate a lm model, then compute the clustered VCOV with sandwich. You end up with VCOVs which are proportional, they only differ in the small sample adjustment.

If you want to have several SEs for a single model, you can have a look at an alternative package to perform OLS and GLM estimations with multiple FEs: fixest.

Using fixest

The package fixest is compatible with sandwich. Here's an example of equivalences:

library(fixest)

est_feols = feols(y ~ x1 + x2 + x3 | species, base)

# feols automatically clusters the SEs
# => proportional VCOVs
vcov(est_feols) / vcovCL(est_feols, cluster = base$species)
#>          x1       x2       x3
#> x1 1.020548 1.020548 1.020548
#> x2 1.020548 1.020548 1.020548
#> x3 1.020548 1.020548 1.020548

# Equivalences -- I:
vcov(est_feols, dof = dof(fixef.K = "none")) / vcovCL(est_feols, cluster = base$species,
                                                      type = "HC1")
#>    x1 x2 x3
#> x1  1  1  1
#> x2  1  1  1
#> x3  1  1  1 

# Equivalences -- II:
vcov(est_feols, dof = dof(adj = FALSE)) / vcovCL(est_feols, cluster = base$species)
#>    x1 x2 x3
#> x1  1  1  1
#> x2  1  1  1
#> x3  1  1  1

For details on how SEs are computed in fixest and how to obtain equivalent SEs from alternative software, please have a look at this vignette: On standard-errors.

Upvotes: 1

Related Questions