Mike Wise

Reputation: 22827

R function with expression as parameter for dplyr summarise

This feels like it should be relatively easy, but although I have tried literally dozens of approaches (quote, eval, substitute, enquote, parse, summarize_, etc.), I haven't gotten it to work. Basically, I am trying to calculate something like this, but with a variable expression as the summarise argument:

mtcars %>% group_by(cyl) %>% summarise(wt=mean(wt),hp=mean(hp))

yielding:

# A tibble: 3 × 3
    cyl       wt        hp
  <dbl>    <dbl>     <dbl>
1     4 2.285727  82.63636
2     6 3.117143 122.28571
3     8 3.999214 209.21429

One of the things I tried was:

  x2 <- "wt=mean(wt),hp=mean(hp)"
  mtcars %>% group_by(cyl) %>% summarise(eval(parse(text=x2)))

yielding:

Error in eval(substitute(expr), envir, enclos) : 
  <text>:1:12: unexpected ','
1: wt=mean(wt),

But dropping the second argument (",hp=mean(hp)") gets you no further:

> x2 <- "wt=mean(wt)"
> mtcars %>% group_by(cyl) %>% summarise(eval(parse(text=x2)))
Error in eval(substitute(expr), envir, enclos) : object 'wt' not found

I will spare you all the other things I tried - I am clearly missing something about how expressions get handled in function arguments.

So what is the proper approach here? Keeping in mind I really want something like this in the end:

getdf <- function(df,sumarg){
  df %>% group_by(cyl) %>% summarise(sumarg)
  df
}

Also not sure what kind of tag I should use for this kind of query in the R world. Metaprogramming?

Upvotes: 2

Views: 426

Answers (1)

Axeman

Reputation: 35377

For maximum flexibility I would use a ... argument, capture those dots using lazyeval, and then pass them to summarise_:

getdf <- function(df, ...){ 
    df %>% group_by(cyl) %>% summarise_(.dots = lazyeval::lazy_dots(...)) 
}

Then you can directly do:

getdf(mtcars, wt = mean(wt), hp = mean(hp))
# A tibble: 3 × 3
    cyl       wt        hp
  <dbl>    <dbl>     <dbl>
1     4 2.285727  82.63636
2     6 3.117143 122.28571
3     8 3.999214 209.21429
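
On more recent dplyr versions, summarise_() is deprecated and lazyeval has been superseded by rlang's tidy evaluation. A minimal sketch of the same idea in that style follows, assuming dplyr 0.7 or later with rlang installed. The name getdf_tidy is made up for illustration and is not part of either package.

```r
library(dplyr)

# Hypothetical tidy-eval variant of getdf(): capture the expressions passed
# through ... as quosures, then splice them into summarise() with !!!.
getdf_tidy <- function(df, ...) {
  summaries <- rlang::enquos(...)
  df %>% group_by(cyl) %>% summarise(!!!summaries)
}

res <- getdf_tidy(mtcars, wt = mean(wt), hp = mean(hp))
print(res)
```

The call looks the same as before; only the capture-and-splice step inside the function changes.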

One way to do it without ... is to pass the arguments in a list, although you will then need to use formulas or quoted strings. E.g.:

getdf2 <- function(df, args){ 
    dots <- lazyeval::as.lazy_dots(args)
    df %>% group_by(cyl) %>% summarise_(.dots = dots) 
}

And use as:

getdf2(mtcars, list(wt = ~mean(wt), hp = ~mean(hp)))

or

getdf2(mtcars, list(wt = "mean(wt)", hp = "mean(hp)"))
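
The list-based interface can also be sketched without lazyeval. This assumes rlang is available and that the expressions refer only to columns of the data, because the formula's environment is discarded here. The names as_summary_expr and getdf2_tidy are hypothetical helpers, not package functions.

```r
library(dplyr)

# Hypothetical helper: turn a string, a one-sided formula, or an already
# quoted expression into a plain R expression.
as_summary_expr <- function(x) {
  if (is.character(x)) {
    rlang::parse_expr(x)   # "mean(wt)"  -> mean(wt)
  } else if (rlang::is_formula(x)) {
    rlang::f_rhs(x)        # ~mean(wt)   -> mean(wt); environment is dropped
  } else {
    x
  }
}

# Hypothetical list-based variant: names of `args` become the output columns.
getdf2_tidy <- function(df, args) {
  exprs <- lapply(args, as_summary_expr)
  df %>% group_by(cyl) %>% summarise(!!!exprs)
}

from_strings  <- getdf2_tidy(mtcars, list(wt = "mean(wt)", hp = "mean(hp)"))
from_formulas <- getdf2_tidy(mtcars, list(wt = ~mean(wt), hp = ~mean(hp)))
```

Both calls should produce the same grouped summary. If the expressions need to see variables from the caller's environment, converting the formulas to quosures instead of stripping them would be the safer choice.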

Upvotes: 4
