Bamqf

Reputation: 3542

Non-standard evaluation with dplyr summarise_ leads to different results

I want to call summarise_ from the dplyr package inside my own function. Here is what I tried, but I get different behaviour for mean and median: the mean version returns a result, while the median version throws an error. What's wrong with my approach?

library(dplyr)
df <- data.frame(A=c(1,2,3))
getMean <- function(df, col) {
  col <- as.symbol(col)
  df %>%
    summarise_(Mean = ~mean(col))
}

getMedian <- function(df, col) {
  col <- as.symbol(col)
  df %>%
    summarise_(Median = ~median(col))
}

getMean(df, 'A')
   Mean
1    2

getMedian(df, 'A')
Error: object 'A' not found 

Upvotes: 2

Views: 286

Answers (1)

akrun

Reputation: 886938

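We can use interp() from lazyeval to substitute the column name, as a symbol, into the formula before summarise_ evaluates it: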

library(lazyeval)
library(dplyr)
getMedian <- function(df, col) {
  df %>%
    summarise_(.dots = list(Median = interp(~median(v), v = as.name(col))))
}

getMedian(df, 'A')
#  Median
#1      2
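
To see what interp() is doing here, a rough base-R analogue may help. This is only a sketch of the idea (splice the column symbol into the call first, then evaluate it against the data), not how lazyeval is implemented; the names expr and result are just for illustration.

```r
df <- data.frame(A = c(1, 2, 3))
col <- "A"

# bquote() replaces .(...) with its value, so the symbol A is
# spliced into the call and we end up with median(A)
expr <- bquote(median(.(as.name(col))))
print(expr)

# Evaluate the finished call with the data frame's columns in scope
result <- eval(expr, df)
print(result)
```

Once the symbol is inside the call, nothing depends on looking up col at evaluation time, which is the part that failed in the question.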

We could also use a single function for the mean, median, etc. by passing the function itself as an argument. The result column is named after the function, with its first letter capitalised.

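getFun <- function(df, col, func) {
  # Resolve func (a function or its name) to a function object
  FUN <- match.fun(func)
  # Capitalise the first letter of the function name for the column label
  nm1 <- sub('^(.)', '\\U\\1', substitute(func), perl=TRUE)
  df %>%
    # Splice the column symbol into the formula, then rename the result
    summarise_(interp(~FUN(v), v = as.name(col))) %>%
    setNames(., nm1)
}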

getFun(df, 'A', median)
#  Median
#1      2
getFun(df, 'A', mean)
#  Mean
#1    2

getFun(df, 'A', var)
#  Var
#1   1

getFun(df, 'A', min)
#  Min
#1   1
getFun(df, 'A', max)
#  Max
#1   3
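
Note that summarise_ and the other underscore verbs have been deprecated since dplyr 0.7. Below is a sketch of the same helper using tidy evaluation, assuming a reasonably recent dplyr. The name get_stat is made up for this example, and it assumes func is passed as a bare function name or a string.

```r
library(dplyr)

df <- data.frame(A = c(1, 2, 3))

# Hypothetical helper, not part of dplyr
get_stat <- function(df, col, func) {
  fun <- match.fun(func)
  # Build the output column name, e.g. median -> "Median"
  nm <- sub("^(.)", "\\U\\1", as.character(substitute(func)), perl = TRUE)
  df %>%
    # .data[[col]] looks the column up by its string name;
    # !!nm := sets the output column name from a variable
    summarise(!!nm := fun(.data[[col]]))
}

get_stat(df, "A", median)
get_stat(df, "A", max)
```

This version does not need lazyeval, because the .data pronoun handles the lookup of a column given as a string.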

Upvotes: 4
