Reputation: 524
Suppose I want to run a series of regressions, like so:
summary(lm(mpg ~ cyl, data = mtcars))
summary(lm(mpg ~ disp, data = mtcars))
summary(lm(mpg ~ wt, data = mtcars))
I want to create a data frame that contains the estimates and standard errors of each of these outputs, preferably with the variable name included. So the ultimate output should look like this:
Variable Beta Coeff
cyl -2.8 .32
disp -.04 .004
wt -5.3 .56
I presume it will require a function. Any ideas out there?
Upvotes: 2
Views: 154
Reputation: 887691
One option would be to loop through the columns of interest, paste to create a formula for lm, tidy the output, slice away the first row (the intercept), and select the columns of interest:
library(broom)
library(tidyverse)
map_df(c("cyl", "disp", "wt"), ~
    lm(paste0("mpg ~ ", .x), data = mtcars) %>%
        tidy() %>%
        slice(-1) %>%
        select(Variable = term, Beta = estimate, Coeff = std.error))
# A tibble: 3 x 3
# Variable Beta Coeff
# <chr> <dbl> <dbl>
#1 cyl -2.88 0.322
#2 disp -0.0412 0.00471
#3 wt -5.34 0.559
Or using base R, which returns a matrix with one row per variable rather than a data frame:
t(sapply(c("cyl", "disp", "wt"), function(x)
summary(lm(paste0("mpg ~ ", x), data = mtcars))$coefficients[-1, 1:2]))
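If you need an actual data frame with the variable name as its own column, the matrix above can be converted. This is only a sketch: the names vars, coefs and result are mine, and I've labelled the standard error column SE instead of Coeff. It uses reformulate() to build the formula and picks the slope row by name rather than by position.

```r
# Predictors to regress mpg on, one model per variable
vars <- c("cyl", "disp", "wt")

# Fit each model and keep the slope's estimate and standard error
coefs <- t(sapply(vars, function(x) {
  fit <- lm(reformulate(x, response = "mpg"), data = mtcars)
  summary(fit)$coefficients[x, c("Estimate", "Std. Error")]
}))

# Turn the matrix into a data frame, moving the row names into a column
result <- data.frame(
  Variable = rownames(coefs),
  Beta = coefs[, "Estimate"],
  SE = coefs[, "Std. Error"],
  row.names = NULL
)
result
```

Passing row.names = NULL explicitly keeps the variable names from also being used as row names.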
Upvotes: 1
Reputation: 206486
One easy way would be to use the purrr and broom packages from the tidyverse.
library(purrr)
library(broom)
cols <- c("cyl", "disp", "wt")
map_df(cols, ~lm(reformulate(.x, "mpg"), data=mtcars) %>% tidy())
# term estimate std.error statistic p.value
# <chr> <dbl> <dbl> <dbl> <dbl>
# 1 (Intercept) 37.9 2.07 18.3 8.37e-18
# 2 cyl -2.88 0.322 -8.92 6.11e-10
# 3 (Intercept) 29.6 1.23 24.1 3.58e-21
# 4 disp -0.0412 0.00471 -8.75 9.38e-10
# 5 (Intercept) 37.3 1.88 19.9 8.24e-19
# 6 wt -5.34 0.559 -9.56 1.29e-10
This gives you some extra info, but you could easily filter it out with dplyr if you like.
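For example, something along these lines should drop the intercept rows and keep only the three columns from the question. Treat it as a sketch: slopes is just a name I picked, and I've called the standard error column SE.

```r
library(purrr)
library(broom)
library(dplyr)

cols <- c("cyl", "disp", "wt")

# Fit one model per predictor and stack the tidied coefficient tables
slopes <- map_df(cols, ~ tidy(lm(reformulate(.x, "mpg"), data = mtcars))) %>%
  # Drop the intercept rows, leaving one slope row per model
  filter(term != "(Intercept)") %>%
  # Keep and rename only the columns of interest
  select(Variable = term, Beta = estimate, SE = std.error)
slopes
```

Filtering on the term name rather than on row position keeps this working if a model ever has more than one predictor.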
Upvotes: 4