Reputation: 175
I have data with three groups and would like to apply a different custom function to each group. Rather than writing three separate functions and calling each one separately, I'm wondering whether I can wrap all three into one function with a 'group' parameter.
For example, say I want the mean for group A:
library(tidyverse)
data(iris)
iris$Group <- c(rep("A", 50), rep("B", 50), rep("C", 50))
f_a <- function(df){
  out <- df %>%
    group_by(Species) %>%
    summarise(mean = mean(Sepal.Length))
  return(out)
}
The median for group B:
f_b <- function(df){
  out <- df %>%
    group_by(Species) %>%
    summarise(median = median(Sepal.Length))
  return(out)
}
And the standard deviation for group C:
f_c <- function(df){
  out <- df %>%
    group_by(Species) %>%
    summarise(sd = sd(Sepal.Length))
  return(out)
}
Is there any way to combine the above functions and run them according to a group parameter? For example:
fx(df, group = "A")
which would produce the same result as the f_a function above.
Keep in mind that in my actual use case I can't simply group_by(group) in the original function, since the actual functions are more complex. Thanks!
Upvotes: 1
Views: 64
Reputation: 388982
I don't understand the point of having the Group column in the dataset. When we pass group = "A" to the function, this has nothing to do with the Group column that was created.
Instead of passing group = "A" to the function and then mapping A to some function, you can pass the function that you want to apply directly.
library(dplyr)
f_a <- function(df, fn){
  out <- df %>%
    group_by(Species) %>%
    summarise(out = fn(Sepal.Length))
  return(out)
}
f_a(iris, mean)
# A tibble: 3 x 2
# Species out
#* <fct> <dbl>
#1 setosa 5.01
#2 versicolor 5.94
#3 virginica 6.59
f_a(iris, median)
# A tibble: 3 x 2
# Species out
#* <fct> <dbl>
#1 setosa 5
#2 versicolor 5.9
#3 virginica 6.5
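Because the summary is just an argument, the same wrapper also accepts sd or any custom function of a numeric vector. A minimal sketch, assuming the f_a signature shown above; value_range is a hypothetical helper made up for illustration and is not part of the question:

```r
library(dplyr)

# Same shape as f_a above: the summary function is passed in as `fn`.
f_a <- function(df, fn) {
  df %>%
    group_by(Species) %>%
    summarise(out = fn(Sepal.Length))
}

# Hypothetical custom summary (an assumption for illustration):
# the spread between the largest and smallest value.
value_range <- function(x) max(x) - min(x)

res_sd    <- f_a(iris, sd)                                  # built-in function
res_range <- f_a(iris, value_range)                         # named custom function
res_trim  <- f_a(iris, function(x) mean(x, trim = 0.1))     # anonymous function
```

Each call returns one row per Species with the result in the out column.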
Upvotes: 0
Reputation: 887158
We use switch inside the function to select the function to apply, based on the value passed as group. The selected function is then applied in summarise after grouping by 'Species', and its name is also used as the name of the output column.
fx <- function(df, group) {
  fn_selector <- switch(group,
                        A = "mean",
                        B = "median",
                        C = "sd")
  df %>%
    group_by(Species) %>%
    summarise(!! fn_selector :=
                match.fun(fn_selector)(Sepal.Length), .groups = 'drop')
}
Testing:
fx(iris, "A")
# A tibble: 3 x 2
# Species mean
# <fct> <dbl>
#1 setosa 5.01
#2 versicolor 5.94
#3 virginica 6.59
fx(iris, "B")
# A tibble: 3 x 2
# Species median
# <fct> <dbl>
#1 setosa 5
#2 versicolor 5.9
#3 virginica 6.5
fx(iris, "C")
# A tibble: 3 x 2
# Species sd
# <fct> <dbl>
#1 setosa 0.352
#2 versicolor 0.516
#3 virginica 0.636
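If the real per-group functions differ by more than the summary statistic, the same switch idea may be used to pick a whole function rather than a function name. This is only a sketch under that assumption: f_a, f_b and f_c below are stand-ins copied from the question, and the match.arg check is an addition so that an unknown group fails with a clear error instead of passing NULL along.

```r
library(dplyr)

# Stand-ins for the three group-specific functions from the question;
# in practice each body could be arbitrarily complex.
f_a <- function(df) {
  df %>% group_by(Species) %>% summarise(mean = mean(Sepal.Length), .groups = "drop")
}
f_b <- function(df) {
  df %>% group_by(Species) %>% summarise(median = median(Sepal.Length), .groups = "drop")
}
f_c <- function(df) {
  df %>% group_by(Species) %>% summarise(sd = sd(Sepal.Length), .groups = "drop")
}

fx <- function(df, group) {
  # match.arg() stops with an error for anything other than "A", "B" or "C".
  group <- match.arg(group, c("A", "B", "C"))
  # switch() returns the function object itself, which is then called on df.
  handler <- switch(group, A = f_a, B = f_b, C = f_c)
  handler(df)
}

res_a <- fx(iris, "A")
res_c <- fx(iris, "C")
```

Here fx(iris, "A") gives the same table as f_a(iris), while a call such as fx(iris, "D") raises an error.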
Upvotes: 1