johnsonzhj

Reputation: 527

How to run many regressions across rows and columns with vectorization

I want to run a series of linear regressions for multiple groups and across multiple columns. For the group stratification across rows, I can use the approach suggested here (Fitting several regression models with dplyr). In addition, I need to repeat the regression with a different covariate column each time. The code below does this with a loop. Can I do both in a vectorized manner, using map() from the purrr package together with group_by() from dplyr, and export the estimated beta coefficients and p-values?

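library(dplyr)
library(broom)
head(mtcars)

# covariates to add to the model one at a time
vec <- names(mtcars)[3:9]

data <- NULL

for (i in seq_along(vec)) {
  # fit mpg ~ disp + <covariate> separately within each cyl group
  df <- mtcars %>%
    group_by(cyl) %>%
    do(fit = lm(paste('mpg ~ disp +', vec[i]), data = .))
  # tidy each per-group fit stored in the list column `fit`
  dfCoef <- tidy(df, fit)
  # keep only the disp coefficient and record which covariate was added
  res <- dfCoef %>%
    filter(term == 'disp')
  res$con <- vec[i]
  data <- bind_rows(data, res)
}
data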

Upvotes: 1

Views: 213

Answers (1)

stefan

Reputation: 124268

This can be done with tidyr::nest() and tidyr::unnest() to run the regressions by group, plus a helper function that fits the models for a single covariate:

library(dplyr)
library(broom)
library(tidyr)
library(purrr)

vec <- names(mtcars)[3:9]

lm_help <- function(vec) {
  mtcars %>% 
    tidyr::nest(data = -cyl) %>% 
    mutate(con = vec,
           fit = purrr::map(data, lm, formula = as.formula(paste0("mpg ~ disp + ", vec))),
           tidy = purrr::map(fit, tidy)) %>% 
    select(cyl, con, tidy) %>% 
    tidyr::unnest(tidy) %>% 
    filter(term == "disp")
}

purrr::map(vec, lm_help) %>% 
  bind_rows()
#> # A tibble: 21 x 7
#>      cyl con   term  estimate std.error statistic p.value
#>    <dbl> <chr> <chr>    <dbl>     <dbl>     <dbl>   <dbl>
#>  1     6 disp  disp   0.00361   0.0156     0.232  0.826  
#>  2     4 disp  disp  -0.135     0.0332    -4.07   0.00278
#>  3     8 disp  disp  -0.0196    0.00932   -2.11   0.0568 
#>  4     6 hp    disp   0.00180   0.0202     0.0890 0.933  
#>  5     4 hp    disp  -0.120     0.0369    -3.24   0.0120 
#>  6     8 hp    disp  -0.0186    0.00946   -1.97   0.0746 
#>  7     6 drat  disp   0.0224    0.0292     0.770  0.484  
#>  8     4 drat  disp  -0.133     0.0406    -3.27   0.0114 
#>  9     8 drat  disp  -0.0196    0.00977   -2.01   0.0697 
#> 10     6 wt    disp   0.0191    0.0109     1.75   0.154  
#> # ... with 11 more rows
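
As a variation on the same idea, the helper function and the outer map() can probably be folded into a single pipeline: give every nested group one row per covariate, then fit with purrr::map2(). This is only a sketch under the same mtcars setup as above. The names res and tidied are placeholders of mine, and reformulate() from base R is swapped in for as.formula(paste0(...)); it is not part of the code above.

```r
library(dplyr)
library(tidyr)
library(purrr)
library(broom)

vec <- names(mtcars)[3:9]

res <- mtcars %>%
  nest(data = -cyl) %>%          # one row per cyl group, data in a list column
  mutate(con = list(vec)) %>%    # attach every covariate name to each group
  unnest(con) %>%                # one row per group x covariate combination
  mutate(
    # fit mpg ~ disp + <covariate> on each group's data
    fit = map2(data, con,
               ~ lm(reformulate(c("disp", .y), response = "mpg"), data = .x)),
    # `tidied` is a placeholder column name holding the coefficient tables
    tidied = map(fit, broom::tidy)
  ) %>%
  select(cyl, con, tidied) %>%
  unnest(tidied) %>%
  filter(term == "disp") %>%
  select(cyl, con, estimate, p.value)   # keep only the beta and the p-value

res
```

The rows come out ordered by cyl group rather than by covariate, so add an arrange() step if the order matters. To export the table, something like write.csv(res, "disp_coefs.csv", row.names = FALSE) should do, where the file name is just an example.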

Upvotes: 2
