Reputation: 3805
Suppose the mathematical operation I need to perform is specified as a character vector:
math.operation <- 'mean' # this could be mean, sum or length
I want to apply this math.operation to a column whose name is also provided as a string, using dplyr:
my.column <- 'col1'
dat <- data.frame(id = rep(1:4, each = 4),
                  col1 = 1:16,
                  col2 = 16:1)
I first selected the column based on my.column, then added back my grouping variable (id), and then tried to do the operation by group:
dat %>%
  dplyr::select(contains(my.column)) %>%
  dplyr::mutate(id = dat$id) %>%
  dplyr::group_by(id) %>%
  dplyr::summarise(match.fun(math.operation)(my.column))
I am stuck on the last line, which produces NAs.
Upvotes: 1
Views: 108
Reputation: 18581
Option 1
You can use do.call with !! sym(). Note that I deleted your first select and mutate calls, since they seem to be redundant for this example.
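To see what !! sym() contributes, it can help to look at it outside of a pipeline. This is a minimal sketch, assuming the rlang package is installed (it is a dependency of dplyr); col_sym and built are illustrative names only.

```r
library(rlang)

my.column <- "col1"

# sym() turns the string into a symbol, i.e. a bare column name
col_sym <- sym(my.column)
is.symbol(col_sym)

# !! inserts that symbol into the expression before it is evaluated,
# so the result is the same expression as if col1 had been typed directly
built <- expr(mean(!!col_sym))
identical(built, quote(mean(col1)))
```

Inside summarise() the same substitution happens, which is why the function ends up seeing the column rather than the string.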
Option 2
Instead of do.call you could use call. Here you would not need to wrap the argument in list(), but you would then need to use eval, so the statement is not really shorter.
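The difference between the two functions can be seen in plain base R, without dplyr. A small sketch; x, cl, via_call and via_do_call are illustrative names.

```r
x <- c(1, 2, 3, 4)

# call() only builds an unevaluated call object ...
cl <- call("mean", x)
is.call(cl)

# ... so eval() is needed to actually run it
via_call <- eval(cl)

# do.call() takes the arguments as a list and evaluates the call in one step
via_do_call <- do.call("mean", list(x))

identical(via_call, via_do_call)
```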
Option 3
A third option is to use your approach with match.fun and !! sym(), which was missing in your example. However, I think do.call is more straightforward.
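The NAs in the question can be reproduced outside of dplyr, which shows why !! sym() is needed. A minimal base R sketch; res is an illustrative name, and suppressWarnings() is only there to keep the output quiet.

```r
math.operation <- "mean"
my.column <- "col1"

# Without !! sym(), the function receives the string "col1" itself,
# not the column it names. mean() of a character value warns and returns NA.
res <- suppressWarnings(match.fun(math.operation)(my.column))
is.na(res)

# For comparison, length() would not fail but would count the string,
# not the rows of the column
length(my.column)
```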
Option 4
Finally you could use eval(parse(...)), but the first way using do.call and !! sym() is preferable.
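One reason to be careful with eval(parse(...)) is that it runs whatever the strings contain. If the operation name comes from outside your script, checking it against the allowed set first is a cheap safeguard for any of the options. A minimal base R sketch; allowed.operations and check_operation are hypothetical names, and the three operations are taken from the comment in the question.

```r
allowed.operations <- c("mean", "sum", "length")

check_operation <- function(op) {
  # match.arg() returns the matching choice (partial matches are accepted)
  # and raises an error if op is not among the allowed operations
  match.arg(op, allowed.operations)
}

math.operation <- check_operation("mean")
```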
library(dplyr)
math.operation <- 'mean' # this could be mean, sum or length
my.column <- 'col1'
dat <- data.frame(id = rep(1:4, each = 4),
                  col1 = 1:16,
                  col2 = 16:1)
# Option 1
dat %>%
  dplyr::group_by(id) %>%
  dplyr::summarise(newvar = do.call(math.operation, list(!! sym(my.column))))
#> `summarise()` ungrouping output (override with `.groups` argument)
#> # A tibble: 4 x 2
#> id newvar
#> <int> <dbl>
#> 1 1 2.5
#> 2 2 6.5
#> 3 3 10.5
#> 4 4 14.5
# Option 2
dat %>%
  dplyr::group_by(id) %>%
  dplyr::summarise(newvar = eval(call(math.operation, !! sym(my.column))))
#> `summarise()` ungrouping output (override with `.groups` argument)
#> # A tibble: 4 x 2
#> id newvar
#> <int> <dbl>
#> 1 1 2.5
#> 2 2 6.5
#> 3 3 10.5
#> 4 4 14.5
# Option 3
dat %>%
  dplyr::group_by(id) %>%
  dplyr::summarise(newvar = match.fun(math.operation)(!! sym(my.column)))
#> `summarise()` ungrouping output (override with `.groups` argument)
#> # A tibble: 4 x 2
#> id newvar
#> <int> <dbl>
#> 1 1 2.5
#> 2 2 6.5
#> 3 3 10.5
#> 4 4 14.5
# Option 4
dat %>%
  dplyr::group_by(id) %>%
  dplyr::summarise(newvar = eval(parse(text = paste0(math.operation, "(", my.column, ")"))))
#> `summarise()` ungrouping output (override with `.groups` argument)
#> # A tibble: 4 x 2
#> id newvar
#> <int> <dbl>
#> 1 1 2.5
#> 2 2 6.5
#> 3 3 10.5
#> 4 4 14.5
Created on 2020-07-08 by the reprex package (v0.3.0)
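As a possible variation on Option 3 (not part of the reprex above), the .data pronoun that dplyr exports can look up a column by its string name, which avoids !! sym() altogether. A sketch under the assumption of dplyr 1.0 or later; res is an illustrative name.

```r
library(dplyr)

math.operation <- "mean"
my.column <- "col1"
dat <- data.frame(id = rep(1:4, each = 4),
                  col1 = 1:16,
                  col2 = 16:1)

# .data[[my.column]] retrieves the column named by the string within each group
res <- dat %>%
  group_by(id) %>%
  summarise(newvar = match.fun(math.operation)(.data[[my.column]]))

res
```

This should give the same four group means as the options above (2.5, 6.5, 10.5, 14.5).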
Upvotes: 1