Lachlan

Reputation: 175

Perform a different simple custom function based on group

I have data with three groups and would like to apply a different custom function to each group. Rather than writing three separate functions and calling them separately, I'm wondering whether I can wrap all three into one function with a 'group' parameter.

For example, say I want the mean for group A:

library(tidyverse)

data(iris)

iris$Group <- c(rep("A", 50), rep("B", 50), rep("C", 50))

f_a <- function(df){
  out <- df %>% 
    group_by(Species) %>% 
    summarise(mean = mean(Sepal.Length))
  return(out)
}

The median for group B:

f_b <- function(df){
  out <- df %>% 
    group_by(Species) %>% 
    summarise(median = median(Sepal.Length))
  return(out)
}

And the standard deviation for group C:

f_c <- function(df){
  out <- df %>% 
    group_by(Species) %>% 
    summarise(sd = sd(Sepal.Length))
  return(out)
}

Is there any way I can combine the above functions and run them according to a group parameter? For example, fx(df, group = "A") would produce the results of the f_a function above.

Keep in mind that in my actual use case I can't simply group_by(Group) in the original function, since the actual functions are more complex. Thanks!

Upvotes: 1

Views: 64

Answers (2)

Ronak Shah

Reputation: 388982

I don't understand the point of having a Group column in the dataset. Passing group = "A" to the function has nothing to do with the Group column that was created.

Instead of passing group = "A" and then mapping A to some function, you can pass the function you want to apply directly.

library(dplyr)

f_a <- function(df, fn){
  out <- df %>% 
          group_by(Species) %>% 
          summarise(out = fn(Sepal.Length))
  return(out)
}

f_a(iris, mean)

# A tibble: 3 x 2
#  Species      out
#* <fct>      <dbl>
#1 setosa      5.01
#2 versicolor  5.94
#3 virginica   6.59

f_a(iris, median)
# A tibble: 3 x 2
#  Species      out
#* <fct>      <dbl>
#1 setosa       5  
#2 versicolor   5.9
#3 virginica    6.5
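
If a group label is still wanted as the argument, one possible bridge between the two ideas is a named list that maps labels to functions. This is only a minimal sketch: `group_fns` and `f_group` are hypothetical names, not something from the question.

```r
library(dplyr)

# Hypothetical lookup table: group label -> summary function
group_fns <- list(A = mean, B = median, C = sd)

f_group <- function(df, group) {
  # Fail early on a label that has no function assigned
  if (!group %in% names(group_fns)) {
    stop("Unknown group: ", group)
  }
  fn <- group_fns[[group]]
  df %>%
    group_by(Species) %>%
    summarise(out = fn(Sepal.Length), .groups = "drop")
}

f_group(iris, "B")
```

Adding a fourth group then only means adding one entry to the list.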

Upvotes: 0

akrun

Reputation: 887158

We create a switch inside the function that selects the name of the function to apply, based on the value of group. After grouping by 'Species', match.fun() turns that name into the function passed to summarise, and the same name is used for the output column.

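library(dplyr)

fx <- function(df, group) {
  # Pick the function name that matches the group label
  fn_selector <- switch(group,
                        A = "mean",
                        B = "median",
                        C = "sd")

  df %>%
    group_by(Species) %>%
    # The name labels the output column and is looked up with match.fun()
    summarise(!! fn_selector := match.fun(fn_selector)(Sepal.Length),
              .groups = 'drop')
}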

Testing:

fx(iris, "A")
# A tibble: 3 x 2
#  Species     mean
#  <fct>      <dbl>
#1 setosa      5.01
#2 versicolor  5.94
#3 virginica   6.59
 
fx(iris, "B")
# A tibble: 3 x 2
#  Species    median
#  <fct>       <dbl>
#1 setosa        5  
#2 versicolor    5.9
#3 virginica     6.5

fx(iris, "C")
# A tibble: 3 x 2
#  Species       sd
#  <fct>      <dbl>
#1 setosa     0.352
#2 versicolor 0.516
#3 virginica  0.636
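
Since the question notes that the real per-group functions are more complex than a single summary call, the same switch idea could also dispatch to whole functions rather than to function names. A rough sketch, assuming compact stand-ins for the question's `f_a`, `f_b`, and `f_c`; `fx_dispatch` is a hypothetical name:

```r
library(dplyr)

# Stand-ins for the question's per-group functions (real ones may be more complex)
f_a <- function(df) {
  df %>% group_by(Species) %>% summarise(mean = mean(Sepal.Length), .groups = "drop")
}
f_b <- function(df) {
  df %>% group_by(Species) %>% summarise(median = median(Sepal.Length), .groups = "drop")
}
f_c <- function(df) {
  df %>% group_by(Species) %>% summarise(sd = sd(Sepal.Length), .groups = "drop")
}

fx_dispatch <- function(df, group) {
  # The unnamed last argument of switch() is the default branch,
  # so an unrecognised label raises an error instead of returning NULL
  f <- switch(group,
              A = f_a,
              B = f_b,
              C = f_c,
              stop("Unknown group: ", group))
  f(df)
}

fx_dispatch(iris, "C")
```

With this layout each per-group function can have its own body, and only the dispatch table needs to know about the labels.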

Upvotes: 1
