Rilcon42

Reputation: 9763

dplyr returns the global mean for each group instead of each group's mean

Can someone explain what I am doing wrong here:

library(dplyr)
temp<-data.frame(a=c(1,2,3,1,2,3,1,2,3),b=c(1,2,3,1,2,3,1,2,3))
temp%>%group_by(temp[,1])%>%summarise(n=n(),mean=mean(temp[,2],na.rm=T))

# A tibble: 3 × 3
  `temp[, 1]`     n  mean
        <dbl> <int> <dbl>
1           1     3     2
2           2     3     2
3           3     3     2

I expected the means to be:

1  1
2  2
3  3

Instead, every group gets what seems to be the global mean: the sum of all values in column 2 divided by the number of rows, 18/9 = 2.

How do I get the mean to be what I expected?

Upvotes: 2

Views: 255

Answers (3)

Jonathan von Schroeder

Reputation: 1703

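Your problem is that you are calculating the mean of temp[,2] instead of the column within each group. The expression mean(temp[,2],na.rm=T) looks up the whole data frame temp, so it always sees all nine rows and does not depend on the grouping at all. Refer to the column by its bare name instead: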

> temp %>% group_by(temp[,1]) %>% summarise(n=n(), mean=mean(b, na.rm=T))
# A tibble: 3 × 3
  `temp[, 1]`     n  mean
        <dbl> <int> <dbl>
1           1     3     1
2           2     3     2
3           3     3     3
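
To see the difference directly, one can compare what each expression evaluates to inside summarise(). This is only a small diagnostic sketch; the column names len_full, len_group, mean_full and mean_group are made up for illustration and are not part of the original question.

```r
library(dplyr)

temp <- data.frame(a = c(1, 2, 3, 1, 2, 3, 1, 2, 3),
                   b = c(1, 2, 3, 1, 2, 3, 1, 2, 3))

res <- temp %>%
  group_by(a) %>%
  summarise(
    len_full   = length(temp[, 2]),  # temp is the whole data frame: all 9 rows
    len_group  = length(b),          # b is the slice for the current group: 3 rows
    mean_full  = mean(temp[, 2]),    # same global mean in every group
    mean_group = mean(b)             # per-group mean
  )

print(res)
```

The bare name b is resolved against the current group, whereas temp[, 2] is ordinary R indexing on the ungrouped object.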

Furthermore, it is more common to use the column name in group_by as well. Since the original call groups by the first column, that is a:

> temp %>% group_by(a) %>% summarise(n=n(), mean=mean(b, na.rm=T))
# A tibble: 3 × 3
      a     n  mean
  <dbl> <int> <dbl>
1     1     3     1
2     2     3     2
3     3     3     3

Upvotes: 3

Khwaja n r

Reputation: 1

Always remember to use column names in dplyr. You will run into problems like this when you reference columns through the data frame and an index (such as temp[,2]) rather than by name. So instead of the code you used

temp%>%group_by(temp[,1])%>%summarise(n=n(),mean=mean(temp[,2],na.rm=T))

try the following, which gives the expected result:

 temp%>%group_by(a)%>%summarise(n=n(),mean=mean(b))
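
If only the column positions are known, one possible approach is to look the names up first and then pass them to dplyr. This is a sketch assuming dplyr 1.0 or later; grp_col and val_col are hypothetical helper variables, not something from the question.

```r
library(dplyr)

temp <- data.frame(a = c(1, 2, 3, 1, 2, 3, 1, 2, 3),
                   b = c(1, 2, 3, 1, 2, 3, 1, 2, 3))

# Translate positions into names once, outside the pipeline
grp_col <- names(temp)[1]  # "a"
val_col <- names(temp)[2]  # "b"

res <- temp %>%
  group_by(across(all_of(grp_col))) %>%           # group by the column named in grp_col
  summarise(n = n(),
            mean = mean(.data[[val_col]], na.rm = TRUE))  # .data[[...]] respects the grouping

print(res)
```

Unlike temp[, 2], the .data pronoun refers to the current group, so the mean is computed per group.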

Upvotes: 0

akrun

Reputation: 887223

An alternative approach is to use data.table:

library(data.table)
setDT(temp)[, .(n = .N, mean = mean(b)), by = a]
#   a n mean
#1: 1 3    1
#2: 2 3    2
#3: 3 3    3

Upvotes: 1
