user43953

Reputation: 109

Regression table with clustered standard errors in R jupyter notebook?

I'm using export_summs in R to make a regression table, but when I use coeftest to get clustered standard errors, the table no longer reports N or R^2 properly in those columns. The coefficients and standard errors look good, just missing those additional stats. (I'm used to outreg2 in Stata which is much simpler.)

I tried tidy_override() as suggested in the last example here (https://hughjonesd.github.io/huxtable/huxreg.pdf), but it made no difference.

# Reproducible example
library(huxtable)
library(jtools)
library(lmtest)
library(sandwich)

datareg <- NULL
datareg$y <- rnorm(1000)
datareg$x <- rnorm(1000)
datareg$cluster_var <- rnorm(1000)
datareg <- data.frame(datareg)

# Plain OLS fit
reg0 <- lm(y ~ x, data = datareg)

# Same model with cluster-robust standard errors
reg1 <- coeftest(lm(y ~ x, data = datareg),
                 vcovCL, cluster = datareg$cluster_var)

export_summs(reg0, reg1,
             model.names = c("Basic", "Cluster SE"))

This issues a warning and produces a table in which the Cluster SE column is missing N and R^2 (screenshot omitted).

Upvotes: 0

Views: 681

Answers (2)

dash2

Reputation: 2262

Huxtable author here. This is how to do it with tidy_override:

library(generics)
library(huxtable)
library(jtools)
library(lmtest)
library(sandwich)

datareg <- NULL
datareg$y <- rnorm(1000)
datareg$x <- rnorm(1000)
datareg$cluster_var <- rnorm(1000)
datareg <- data.frame(datareg)

reg0 <- lm(y ~ x, data = datareg)

reg1 <- coeftest(reg0, vcovCL, cluster = datareg$cluster_var)

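# coeftest objects carry no N or R2, so supply them by hand.
# extend = TRUE is important: it allows adding statistics the object lacks.
reg1 <- tidy_override(reg1,
      glance = list(nobs = 1000L, r.squared = 0.000),
      extend = TRUE)
export_summs(reg0, reg1, model.names = c("Basic", "Cluster SE"))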

Which gives:

────────────────────────────────────────────────────
                       Basic          Cluster SE    
                 ───────────────────────────────────
  (Intercept)              -0.01            -0.01   
                           (0.03)           (0.03)  
  x                        -0.05            -0.05   
                           (0.03)           (0.03)  
                 ───────────────────────────────────
  N                      1000             1000      
  R2                        0.00             0.00   
────────────────────────────────────────────────────
  *** p < 0.001; ** p < 0.01; * p < 0.05.           

Column names: names, Basic, Cluster SE

This was fairly tricky and I appreciate your difficulties... I have improved the error reporting in huxreg as a result!
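
The N and R2 passed to tidy_override above are typed in by hand. As a minimal sketch, assuming the clustered column is built from the same lm fit, both values can be read from that fit with base R instead. The set.seed call and the name glance_stats are additions for illustration and are not part of the original answer.

```r
# Base R only: read N and R^2 from the fitted lm rather than hard-coding them.
set.seed(1)  # assumption: seed added only so the sketch is repeatable
datareg <- data.frame(y = rnorm(1000), x = rnorm(1000))

reg0 <- lm(y ~ x, data = datareg)

# glance_stats is a hypothetical name for the list handed to tidy_override
glance_stats <- list(
  nobs      = nobs(reg0),               # number of observations used in the fit
  r.squared = summary(reg0)$r.squared   # R^2 of the underlying OLS model
)
str(glance_stats)
```

The list can then replace the literal values, as in tidy_override(reg1, glance = glance_stats, extend = TRUE), so the table stays correct if the data or the model change.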

Upvotes: 1

paqmo

Reputation: 3739

This is a case where the error message is fairly clear: the broom package does not have a glance method for coeftest objects. This is not an accident--the nature of the coeftest object does not allow for broom to calculate model summary statistics. It retains very little information about the original model:

> str(reg1)
 'coeftest' num [1:2, 1:4] 0.0483 0.0153 0.0329 0.0341 1.4668 ...
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:2] "(Intercept)" "x"
  ..$ : chr [1:4] "Estimate" "Std. Error" "t value" "Pr(>|t|)"
 - attr(*, "method")= chr "t test of coefficients"
 - attr(*, "df")= int 998

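One option is to use the lm_robust function from the estimatr package. It returns objects with robust standard errors that work with both glance and tidy; pass its clusters argument to get cluster-robust standard errors:

 # clusters = ... requests cluster-robust standard errors
 reg2 <- estimatr::lm_robust(y ~ x, data = datareg,
             clusters = cluster_var)
 # number_format = NA leaves the numbers unformatted
 export_summs(reg0, reg2,
    model.names = c("Basic", "Cluster SE"), number_format = NA)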


──────────────────────────────────────────────────────────────────
                         Basic                  Cluster SE        
              ────────────────────────────────────────────────────
  (Intercept)      0.0482678107925753        0.0482678107925755   
                  (0.032842483472098)       (0.0329070612421128)  
  x                0.0152928320138191        0.015292832013819    
                  (0.0333488383365212)      (0.034094868727288)   
              ────────────────────────────────────────────────────
  N             1000                      1000                    
  R2               0.000210664993144995      0.000210665          
──────────────────────────────────────────────────────────────────
  *** p < 0.001; ** p < 0.01; * p < 0.05.                         

Column names: names, Basic, Cluster SE
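
One thing worth noting about the example data: cluster_var is drawn with rnorm, so every observation ends up in its own cluster. The sketch below assumes a hypothetical integer grouping variable, cluster_id, with 50 groups, to show what a call with real clusters might look like. The seed, the variable name, and the number of groups are all illustrative assumptions.

```r
library(estimatr)

set.seed(42)  # assumption: seed added only so the sketch is repeatable
n <- 1000
datareg <- data.frame(
  y = rnorm(n),
  x = rnorm(n),
  # hypothetical grouping: up to 50 distinct clusters, not one per observation
  cluster_id = sample(1:50, n, replace = TRUE)
)

# With clusters supplied, lm_robust reports cluster-robust standard errors
reg_cl <- lm_robust(y ~ x, data = datareg, clusters = cluster_id)

# Coefficients and clustered standard errors as a data frame
tidy(reg_cl)[, c("term", "estimate", "std.error")]

# Model-level statistics stay attached to the fitted object
reg_cl$nclusters
reg_cl$r.squared
```

Because the fitted object keeps its model-level statistics, it can go straight into export_summs without tidy_override.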

Upvotes: 2
