Why does the ``mean`` function not work properly with ``group_by %>% summarise`` in a function environment?

Question

For example:

df <- data.frame("Treatment" = c(rep("A", 2), rep("B", 2)), "Price" = 1:4, "Cost" = 2:5)

I want to summarize every variable by treatment and then combine the results. My plan is to define a function that summarizes a single variable, call it once per variable, and rbind the outputs afterwards.

SummarizeFn <- function(x, y, z) {
  df1 <- x %>%
    group_by(Treatment) %>%
    summarize(n = n(), Mean = mean(y), SD = sd(y))
  df1$Var <- z  # add a column to show which variable those statistics belong to
  df1           # return the summary table
}
SumPrice <- SummarizeFn(df, df$Price, "Price")

However, the results are:

  Treatment     n  Mean    SD Var  
1 A             2   2.5  1.29 Price
2 B             2   2.5  1.29 Price

These are the mean and SD of all the observations, not of the observations within each Treatment group. What is the problem here?

If I take the code out of the function environment, it works totally fine. Please help, thanks.

If you have a better way to achieve my purpose, that would be great! Thanks!

Ronak Shah · Accepted Answer

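A value extracted with $ (such as df$Price) is an ordinary vector taken from the whole data frame, so dplyr cannot apply the grouping to it: mean(y) is computed over all rows for every group. Inside a function, pass the bare column name instead and wrap it in {{ }} so that summarize evaluates it within each group.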
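To see the first point in isolation, here is a minimal sketch outside any function. The object names by_name and by_dollar are made up for this illustration; the data is a cut-down version of the df from the question.

```r
library(dplyr)

df <- data.frame(Treatment = c("A", "A", "B", "B"), Price = 1:4)

# Bare column name: mean() only sees the rows of the current group.
by_name <- df %>%
  group_by(Treatment) %>%
  summarise(Mean = mean(Price))

# df$Price: mean() sees the full column of the original data frame,
# so every group receives the overall mean.
by_dollar <- df %>%
  group_by(Treatment) %>%
  summarise(Mean = mean(df$Price))

by_name$Mean    # 1.5 3.5
by_dollar$Mean  # 2.5 2.5
```

The second result matches the output in the question, where both groups show a mean of 2.5.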

library(dplyr)

SummarizeFn <- function(x,y,z) {
  x %>% 
    group_by(Treatment) %>% 
    summarize(n = n(), Mean = mean({{y}}), SD = sd({{y}}), Var = z)
}

SummarizeFn(df, Price, "Price")

#  Treatment     n  Mean    SD Var  
#1 A             2   1.5 0.707 Price
#2 B             2   3.5 0.707 Price
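Since the question also asks how to combine the summaries of several variables, one possible sketch is to pass the column name as a string and look it up with the .data pronoun, then stack the results with bind_rows. The helper name summarize_var, its arguments, and the object all_stats are hypothetical and not part of the original answer.

```r
library(dplyr)

df <- data.frame(Treatment = c(rep("A", 2), rep("B", 2)),
                 Price = 1:4, Cost = 2:5)

# Hypothetical helper: col_name is a string naming the column to summarize.
summarize_var <- function(data, col_name) {
  data %>%
    group_by(Treatment) %>%
    summarise(n = n(),
              Mean = mean(.data[[col_name]]),  # looked up within each group
              SD = sd(.data[[col_name]]),
              Var = col_name)                  # label rows with the variable
}

# One summary table per variable, stacked row-wise.
all_stats <- bind_rows(
  lapply(c("Price", "Cost"), function(v) summarize_var(df, v))
)
```

With a string argument the Var label no longer has to be passed separately, which is why this version takes two arguments instead of three.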
