awcm0n

Reputation: 23

Passing an argument to a regression model inside a function that uses dplyr in R

I wrote a function to run a univariable regression on a filtered data set. The function takes as arguments a value used for filtering and the name of the predictor for the regression model. As you can see, I am struggling with data masking and evaluation. How do I use the .pred argument directly in the regression model? Thanks!

pacman::p_load(tidyverse, purrr, broom)
data("mtcars")

# my function
regr_func <- function(.cyl, .pred){
  
  mtcars %>% 
    filter(cyl == .cyl) %>%  # cars with .cyl cylinders
    mutate(x = .data[[.pred]]) %>%  # this is a bit of a hack :(
    lm(mpg ~ x, data = .) %>% 
    tidy() %>% 
    mutate(predictor = .pred,
           cylinders = .cyl)
}

regr_func(4, "hp")
#> # A tibble: 2 × 7
#>   term        estimate std.error statistic   p.value predictor cylinders
#>   <chr>          <dbl>     <dbl>     <dbl>     <dbl> <chr>         <dbl>
#> 1 (Intercept)   36.0      5.20        6.92 0.0000693 hp                4
#> 2 x             -0.113    0.0612     -1.84 0.0984    hp                4
Created on 2021-10-26 by the reprex package (v2.0.1)

Update

Thanks to Jon's tip, I could rewrite the function to pass the .pred argument directly to lm(), but now I can't pipe the data into lm(), so I had to create a new data set inside the function.

regr_func1 <- function(.cyl, .pred){
  
  tmp <- mtcars %>% filter(cyl == .cyl)
  
  xsym <- rlang::ensym(.pred)
  rlang::inject( lm(mpg ~ !!xsym, data = tmp) ) %>% 
    tidy() %>% 
    mutate(cylinders = .cyl)
}

Upvotes: 2

Views: 830

Answers (2)

Ronak Shah

Reputation: 388982

You can create the formula on the fly using as.formula() or reformulate(), without breaking the pipe.

library(dplyr)
library(broom)

regr_func <- function(.cyl, .pred){
  
  mtcars %>% 
    filter(cyl == .cyl) %>%  
    lm(reformulate(.pred, 'mpg'), data = .) %>% 
    tidy() %>% 
    mutate(predictor = .pred,
           cylinders = .cyl)
}
regr_func(4, "hp")

#  term        estimate std.error statistic   p.value predictor cylinders
#  <chr>          <dbl>     <dbl>     <dbl>     <dbl> <chr>         <dbl>
#1 (Intercept)   36.0      5.20        6.92 0.0000693 hp                4
#2 hp            -0.113    0.0612     -1.84 0.0984    hp                4
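
The as.formula() route mentioned above could look roughly like the following. This is only a minimal sketch: the function name regr_func_af is made up for illustration, and it assumes .pred is a single, syntactically valid column name passed as a string.

```r
library(dplyr)
library(broom)

# Hypothetical variant of regr_func that builds the formula with as.formula()
regr_func_af <- function(.cyl, .pred){

  # paste() gives a string such as "mpg ~ hp"; as.formula() converts it to a formula
  f <- as.formula(paste("mpg ~", .pred))

  mtcars %>%
    filter(cyl == .cyl) %>%   # cars with .cyl cylinders
    lm(f, data = .) %>%       # the dot keeps the data flowing through the pipe
    tidy() %>%
    mutate(predictor = .pred,
           cylinders = .cyl)
}

res_af <- regr_func_af(4, "hp")
```

reformulate() is probably the safer choice when the predictor names come from user input, since it builds the formula from the term labels instead of pasting and parsing text.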

Upvotes: 0

runr

Reputation: 1146

An alternative approach, using the glue package:

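library(glue)

regr_func <- function(.cyl, .pred){
  # build the formula as a string, e.g. "mpg ~ hp"
  o <- glue('mpg ~ {.pred}')
  # fit on the cars with .cyl cylinders; returns the lm object itself
  lm(o, data = subset(mtcars, cyl == .cyl))
}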
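
The function above returns the fitted lm object rather than the tidy table from the question. If the tidy output is wanted, something along these lines might work. It is a sketch only: regr_func_glue is a hypothetical name, and the explicit as.character()/as.formula() conversion is an extra step added here for clarity, not part of the approach above.

```r
library(broom)
library(glue)

# Hypothetical variant: glue builds the formula, broom tidies the fit
regr_func_glue <- function(.cyl, .pred){
  # glue() fills in {.pred}; as.character() drops the glue class before as.formula()
  f <- as.formula(as.character(glue("mpg ~ {.pred}")))

  # keep only the cars with .cyl cylinders
  dat <- subset(mtcars, cyl == .cyl)

  out <- tidy(lm(f, data = dat))
  out$predictor <- .pred
  out$cylinders <- .cyl
  out
}

res_glue <- regr_func_glue(6, "wt")
```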

Upvotes: 3
