Tyler Muth

Reputation: 1393

dplyr and Non-standard evaluation (NSE)

I'm trying to write a function that takes in the name of a data frame and a column to summarize by using dplyr, then returns the summarized data frame. I've tried a bunch of permutations of interp() from the lazyeval package, but I've spent way too much time trying to get it to work. So, I wrote a "static" version of the function I want here:

summarize.df.static <- function(){
  temp_df <- mtcars %>%
    group_by(cyl) %>%
    summarize(qsec = mean(qsec),
              mpg=mean(mpg))
  return(temp_df)
}

new_df <- summarize.df.static()
head(new_df)

Here is the start of the dynamic version I'm stuck on:

summarize.df.dynamic <- function(df_in,sum_metric_in){
  temp_df <- df_in %>%
    group_by(cyl) %>%
    summarize_(qsec = mean(qsec),
              sum_metric_in=mean(sum_metric_in)) # some mix of interp()
  return(temp_df)
}

new_df <- summarize.df.dynamic(mtcars,"mpg")
head(new_df)

Note that I want the output column name in this example to come from the parameter passed in as well (mpg in this case). Also note that the qsec column is static, i.e. not passed in.

Below is the correct answer posted by "docendo discimus":

summarize.df.dynamic<- function(df_in, sum_metric_in){
  temp_df <- df_in %>%
    group_by(cyl) %>%
    summarize_(qsec = ~mean(qsec), 
               xyz = interp(~mean(var), var = as.name(sum_metric_in))) 

  names(temp_df)[names(temp_df) == "xyz"] <- sum_metric_in  
  return(temp_df)
}

new_df <- summarize.df.dynamic(mtcars,"mpg")
head(new_df)

#  cyl     qsec      mpg
#1   4 19.13727 26.66364
#2   6 17.97714 19.74286
#3   8 16.77214 15.10000

new_df <- summarize.df.dynamic(mtcars,"disp")
head(new_df)

#  cyl     qsec     disp
#1   4 19.13727 105.1364
#2   6 17.97714 183.3143
#3   8 16.77214 353.1000

Upvotes: 5

Views: 2072

Answers (3)

akrun

Reputation: 887088

Using the devel version of dplyr (soon to be released as 0.6.0 in April 2017), we can also make use of quosures:

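library(dplyr)

summarise.dfN <- function(df, expr) {
  expr <- enquo(expr)      # capture the column expression as a quosure
  colN <- quo_name(expr)   # its string form, used as the output column name
  df %>%
    group_by(cyl) %>%
    summarise(qsec = mean(qsec),
              !!colN := mean(!!expr))
}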

summarise.dfN(mtcars, mpg)
# A tibble: 3 × 3
#    cyl     qsec      mpg
#  <dbl>    <dbl>    <dbl>
#1     4 19.13727 26.66364
#2     6 17.97714 19.74286
#3     8 16.77214 15.10000

enquo works much like substitute: it captures the expression supplied as the argument and returns it as a quosure. quo_name converts that expression to a string. Inside group_by/summarise/mutate etc., we unquote with !! (or UQ) so that the captured expression is evaluated.
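To make the capture step concrete, here is a minimal sketch of what enquo and quo_name hand back. The helper name show_capture is hypothetical and exists only for illustration; it assumes a dplyr version that exports enquo and quo_name.

```r
library(dplyr)  # assumed to export enquo() and quo_name()

# Hypothetical helper: reports what was captured from the caller's argument.
show_capture <- function(expr) {
  expr <- enquo(expr)            # the unevaluated argument, as a quosure
  list(quosure = expr,
       name    = quo_name(expr)) # the same expression as a string
}

captured <- show_capture(mpg)    # note: mpg is passed bare, not quoted
captured$name
# [1] "mpg"
```

The string is what ends up on the left of := as the new column name, while the quosure is what gets unquoted inside mean().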

We can also pass the grouping variable as an argument:

summarise.dfN2 <- function(df, expr, grpVar) {
  expr <- enquo(expr)
  grpVar <- enquo(grpVar)
  colN <- quo_name(expr)
  df %>%
    group_by(!!grpVar) %>%
    summarise(qsec = mean(qsec),
              !!colN := mean(!!expr))
}

summarise.dfN2(mtcars, mpg, cyl)
# A tibble: 3 × 3
#    cyl     qsec      mpg
#  <dbl>    <dbl>    <dbl>
#1     4 19.13727 26.66364
#2     6 17.97714 19.74286
#3     8 16.77214 15.10000

Upvotes: 3

talat

Reputation: 70266

For the specific example (with static "qsec" etc) you could do:

library(dplyr)
library(lazyeval)
summarize.df <- function(data, sum_metric_in){
  data <- data %>%
    group_by(cyl) %>%
    summarize_(qsec = ~mean(qsec), 
               xyz = interp(~mean(var), var = as.name(sum_metric_in))) 

  names(data)[names(data) == "xyz"] <- sum_metric_in  
  data
}

summarize.df(mtcars, "mpg")
#Source: local data frame [3 x 3]
#
#  cyl     qsec      mpg
#1   4 19.13727 26.66364
#2   6 17.97714 19.74286
#3   8 16.77214 15.10000

AFAIK you cannot (yet?) supply the input "sum_metric_in" to dplyr::rename, which you would typically use to rename the column. That is why I renamed it via names() in the example instead.
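For readers on a later dplyr, the renaming limitation appears to have been lifted once tidy evaluation arrived. The following is a sketch, assuming a dplyr version that supports !! and := inside rename (0.7.0 or later, as far as I know); new_name is a hypothetical variable standing in for sum_metric_in.

```r
library(dplyr)

# Summarise under a placeholder name, as in the answer above.
summarised <- mtcars %>%
  group_by(cyl) %>%
  summarise(xyz = mean(mpg))

new_name <- "mpg"  # hypothetical: the target column name arrives as a string

# Unquote the string on the left-hand side of := to use it as the new name.
renamed <- rename(summarised, !!new_name := xyz)
names(renamed)
# [1] "cyl" "mpg"
```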

Upvotes: 7

shadow

Reputation: 22293

You could use paste or ~ to build a quoted input that summarize_ understands.

df_in %>%
  group_by(cyl) %>%
  summarize_(qsec = ~mean(qsec),
             sum_metric_in=paste0('mean(', sum_metric_in, ')'))
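Note that in the snippet above the result column is named literally sum_metric_in rather than mpg. As a hedged alternative that avoids pasting code strings, here is a sketch that assumes a recent dplyr (1.0.0 or later) where the .data pronoun and glue-style names with := are available. The function name summarize_by_name is hypothetical.

```r
library(dplyr)

# Hypothetical function: the column to summarise is given as a string.
summarize_by_name <- function(df_in, sum_metric_in) {
  df_in %>%
    group_by(cyl) %>%
    summarize(qsec = mean(qsec),
              # "{...}" := sets the output name from the string;
              # .data[[...]] looks the column up by that string.
              "{sum_metric_in}" := mean(.data[[sum_metric_in]]))
}

res <- summarize_by_name(mtcars, "mpg")
names(res)
# [1] "cyl"  "qsec" "mpg"
```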

Upvotes: 4
