mavericks

Reputation: 1165

modelsummary/kableExtra regression table with models of the same name

I use modelsummary() together with kableExtra to generate a regression table in an Rmd file (final output formats: LaTeX and HTML).

I run regressions for several variable combinations and model specifications. The regressions are grouped in the table by variable combination via kableExtra::add_header_above().

For each variable combination, I run the same set of models (e.g. OLS and Poisson). To improve readability, I would therefore like to name the models simply by type, e.g.

names(models) <- c("OLS", "Poisson", "OLS", "Poisson", ...)

instead of

names(models) <- c("OLS 1", "Poisson 1", "OLS 2", "Poisson 2", ...)

However, modelsummary() does not seem to accept several models with the same name; I get the following errors:

Error: Can't bind data because some arguments have the same name
Backtrace:
  1. modelsummary::msummary(...)
  2. modelsummary::extract(...)
 10. dplyr::mutate(., group = "gof")
 12. dplyr:::mutate_cols(.data, ...)
 13. DataMask$new(.data, caller_env())
 14. .subset2(public_bind_env, "initialize")(...)
 17. rlang::env_bind_lazy(...)
 18. rlang:::env_bind_impl(.env, exprs, "env_bind_lazy()", TRUE, binder)

and

 Error in htmlTable_add_header_above(kable_input, header, bold, italic,  : 
 The new header row you provided has a different total number of columns with the original `kabel()` output.

MWE:

library(modelsummary)
library(kableExtra)

url <- 'https://vincentarelbundock.github.io/Rdatasets/csv/HistData/Guerry.csv'
dat <- read.csv(url)

models <- list()
models[['OLS']] <- lm(Crime_prop ~ Literacy, data = dat)
models[['Poisson']] <- glm(Crime_prop ~ Literacy + Clergy, family = poisson, data = dat)
models[['OLS']] <- lm(Crime_pers ~ Literacy, data = dat)
models[['Poisson']] <- glm(Crime_pers ~ Literacy + Clergy, family = poisson, data = dat)

# build table with `modelsummary` 
cm <- c( '(Intercept)' = 'Constant', 'Literacy' = 'Literacy (%)', 'Clergy' = 'Priests/capita')
cap <- 'A modelsummary table customized with kableExtra'

tab <- msummary(models, output = 'kableExtra',
                coef_map = cm, stars = TRUE,
                title = cap, gof_omit = 'IC|Log|Adj')

# customize table with `kableExtra`
tab %>%
  
  # column labels
  add_header_above(c(" " = 1, "Crimes (property)" = 2, "Crimes (person)" = 2))

AddOn:

One workaround is to add a space " " to the model name, prior to building the table with modelsummary:

names(models) <- c("OLS", "Poisson", "OLS ", "Poisson ", ...)

Doing this by hand is feasible for a few model specifications and variable combinations. However, I would prefer a solution that adapts dynamically to the given settings, so that it also covers cases such as the following:

names(models) <- c("OLS", "Poisson", "GLM", "Poisson", ...)

instead of

names(models) <- c("OLS 1", "Poisson 1", "GLM 2", "Poisson 2", ...)
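
The trailing-space workaround could be generated automatically rather than typed by hand. The following is a minimal sketch in base R; `pad_duplicate_names` is a hypothetical helper name, not a function from modelsummary or kableExtra, and it assumes that none of the original names already ends in a space.

```r
# Hypothetical helper (not part of modelsummary or kableExtra):
# make repeated names unique by appending trailing spaces.
pad_duplicate_names <- function(x) {
  # Within each group of identical names, number the occurrences 1, 2, 3, ...
  occurrence <- ave(seq_along(x), x, FUN = seq_along)
  # The first occurrence gets no padding, the second one space, and so on
  paste0(x, strrep(" ", occurrence - 1))
}

model_names <- c("OLS", "Poisson", "GLM", "Poisson")
padded <- pad_duplicate_names(model_names)
anyDuplicated(padded)  # 0 means all names are now unique
```

The padded vector can then be assigned with names(models) <- padded before the table is built.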

UPDATE:

With the updated package version made available by @Vincent, regression tables with models of the same name are easy to build, including when the models are stored in nested lists, e.g. when they are added to sublists in a loop or via lapply(..., FUN).

models <- list()
models[["a"]][["OLS"]] <- lm(Crime_prop ~ Literacy, data = dat)
models[["a"]][["Poisson"]] <- glm(Crime_prop ~ Literacy + Clergy, family = poisson, data = dat)
models[["b"]][["OLS"]] <- lm(Crime_pers ~ Literacy, data = dat)
models[["b"]][["Poisson"]] <- glm(Crime_pers ~ Literacy + Clergy, family = poisson, data = dat)
# ...

models_unlisted <- unlist(models, recursive=FALSE)
names(models_unlisted) <- c('ols', 'poisson', 'ols', 'poisson')

cm <- c( '(Intercept)' = 'Constant', 'Literacy' = 'Literacy (%)', 'Clergy' = 'Priests/capita')

msummary(models_unlisted, output = 'kableExtra', statistic_vertical = FALSE,
         coef_map = cm, stars = TRUE, gof_omit = 'IC|Log|Adj') %>%
  add_header_above(c(" " = 1, "Crimes (property)" = 2, "Crimes (person)" = 2))
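
With a nested list, the model names and the header spans could also be derived from the list itself instead of being typed out. This is only a sketch in base R: the strings "m1" to "m4" are placeholders standing in for fitted models, and naming the sublists after the desired header labels is an assumption, not something the code above does.

```r
# Placeholder strings stand in for fitted lm/glm objects (illustration only)
nested <- list(
  "Crimes (property)" = list(OLS = "m1", Poisson = "m2"),
  "Crimes (person)"   = list(OLS = "m3", Poisson = "m4")
)

# Flatten one level; the default names would be "Crimes (property).OLS", ...
flat <- unlist(nested, recursive = FALSE)

# Keep only the inner names, so that they repeat across groups
names(flat) <- unlist(lapply(nested, names), use.names = FALSE)

# One leading column for the coefficient labels, then one span per group
header <- c(" " = 1, lengths(nested))
```

Under these assumptions, flat would be passed to msummary() and header to add_header_above().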

modelsummary regression table output

Upvotes: 1

Views: 2872

Answers (2)

Vincent

Reputation: 17725

Thanks for the question. The other poster is right: the code in your MWE can never work, because of a fundamental feature of the R language: assigning to a name that already exists in a list overwrites the previous value. See:

a <- list()
a['blah'] <- 1
a['blah'] <- 2
a
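
As a further illustration (a sketch with placeholder values rather than fitted models), the overwrite only happens when assigning by name. A list built by position can be given duplicated names afterwards with names<-, although lookup by name then returns only the first match:

```r
# Assigning by name: the second assignment replaces the first element
by_name <- list()
by_name[["OLS"]] <- 1
by_name[["OLS"]] <- 2
length(by_name)  # 1

# Building by position, then naming: base R allows duplicated names
by_position <- list(1, 2)
names(by_position) <- c("OLS", "OLS")
length(by_position)   # 2
by_position[["OLS"]]  # 1, because name lookup stops at the first match
```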

The easiest trick I know is the one already proposed: add a space after the names. This has one main disadvantage: it makes it harder to select columns by name when customizing them with gt or kableExtra. Aside from that, it is quite innocuous, since all table-making packages strip out the white space before displaying the table.

After reading your question, I added a line of code to modelsummary to "pad" model names automatically. If you install from GitHub (I'll release to CRAN soon), you should be able to run this:

library(remotes)
install_github('vincentarelbundock/modelsummary')

library(modelsummary)
library(kableExtra)

url <- 'https://vincentarelbundock.github.io/Rdatasets/csv/HistData/Guerry.csv'
dat <- read.csv(url)

models <- list()
models[[1]] <- lm(Crime_prop ~ Literacy, data = dat)
models[[2]] <- glm(Crime_prop ~ Literacy + Clergy, family = poisson, data = dat)
models[[3]] <- lm(Crime_pers ~ Literacy, data = dat)
models[[4]] <- glm(Crime_pers ~ Literacy + Clergy, family = poisson, data = dat)
names(models) <- c('ols', 'poisson', 'ols', 'poisson')

cm <- c( '(Intercept)' = 'Constant', 'Literacy' = 'Literacy (%)', 'Clergy' = 'Priests/capita')
cap <- 'A modelsummary table customized with kableExtra'

msummary(models, output = 'kableExtra',
         coef_map = cm, stars = TRUE,
         title = cap, gof_omit = 'IC|Log|Adj') %>%
  add_header_above(c(" " = 1, "Crimes (property)" = 2, "Crimes (person)" = 2))

PS: please open an issue on GitHub if you have feature requests: https://github.com/vincentarelbundock/modelsummary/issues

Upvotes: 1

Chris

Reputation: 286

At the moment, the 3rd and 4th models in your MWE overwrite the first two, so the models list contains only two elements. This is what triggers the "different total number of columns" error.

If it is just readability you are after, you could add a space after the names of the 3rd and 4th models; the rest should then display nicely.

models[['OLS ']] <- lm(Crime_pers ~ Literacy, data = dat)
models[['Poisson ']] <- glm(Crime_pers ~ Literacy + Clergy, family = poisson, data = dat)

Upvotes: 1
