Getting two different means in R using same numbers

Question

I'm trying to calculate the mean of some grouped data, but I'm running into an issue where the mean generated using base::mean() is generating a different value than when I use base:rowMeans() or try to replicate the mean in Excel.

Here's the code with a simplified data frame looking at just a small piece of the data:

df <- data.frame("ID" = 1101372,
                 "Q1" = 5.996667,
                 "Q2" = 6.005556,
                 "Q3" = 5.763333)

avg1 <- df %>%
  summarise(new_avg = mean(Q1,
                           Q2,
                           Q3)) # Returns a value of 5.99667

avg2 <- rowMeans(df[,2:4]) # Returns a value of 5.921852

The value in avg2 is what I get when I use AVERAGE in Excel, but I can't figure out why mean() is not generating the same number.

Any thoughts?

akrun · Accepted Answer

Here, the mean is taking only the first argument i.e. Q1 as 'x' because the usage for ?mean is

mean(x, trim = 0, na.rm = FALSE, ...)

i.e. the second and third argument are different. In the OP's code, x will be taken as "Q1", trim as "Q2" and so on.. The ... at the end also means that the user can supply n number of parameters without any error and leads to confusions like this (if we don't check the usage)

We can specify the data as ., subset the columns of interest and use that in rowMeans

df %>% 
    summarise(new_avg = rowMeans(.[-1]))

This would be more efficient. But, if we want to use mean as such, then do a rowwise

df %>%
   rowwise() %>%
   summarise(new_avg = mean(c(Q1, Q2, Q3)))
# A tibble: 1 x 1
#  new_avg
#     
#1    5.92

Or convert to 'long' format and then do the group_by 'ID' and get the mean

library(dplyr)
library(tidyr)
df %>% 
     pivot_longer(cols = -ID) %>%
     group_by(ID) %>%  # can skip this step if there is only a single row
    summarise(new_avg = mean(value))
# A tibble: 1 x 2
#      ID new_avg
#       
#1 1101372    5.92

Getting two different means in R using same numbers

Answers (1)

Related Questions