Doug Fir

Reputation: 21212

Frequency table but custom function instead of default count?

Suppose I have a data frame:

bla <- data.frame(
  a = c(1,1,1,0,0,1,1,1,0,0),
  b = c(0,0,0,1,1,0,0,1,1,0),
  c = c(1,0,1,0,1,0,1,0,1,0),
  d = c(2,3,4,7,8,6,5,2,1,0)
)

I can use table() to get the counts of each combination of 1/0 for each of a, b and c:

library(dplyr)

table(bla %>% select(a:c)) %>% as.data.frame()

  a b c Freq
1 0 0 0    1
2 1 0 0    2
3 0 1 0    1
4 1 1 0    1
5 0 0 1    0
6 1 0 1    3
7 0 1 1    2
8 1 1 1    0

Here's my question: is there a way to get both the frequency AND the mean of column d for each combination of a, b and c?

That is, table() appears to group by each distinct combination and then return the count (the Freq column). Can I do the same but also get mean(d)?

Upvotes: 1

Views: 55

Answers (2)

tmfmnk

Reputation: 39858

If you also want the combinations that do not occur in the data, you can do this with dplyr and tidyr:

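library(dplyr)
library(tidyr)

bla %>%
 complete(a, b, c) %>%              # add rows (d = NA) for combinations that never occur
 group_by(a, b, c) %>%
 summarise(count = sum(!is.na(d)),  # count only real observations
           mean = mean(d))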

      a     b     c count  mean
  <dbl> <dbl> <dbl> <dbl> <dbl>
1     0     0     0     1  0   
2     0     0     1     0 NA   
3     0     1     0     1  7   
4     0     1     1     2  4.5 
5     1     0     0     2  4.5 
6     1     0     1     3  3.67
7     1     1     0     1  2   
8     1     1     1     0 NA   

Upvotes: 2

bouncyball

Reputation: 10761

Here's a base R solution using aggregate:

aggregate(d ~ ., data = bla, 
          FUN = function(x) c('mean' = mean(x), 'count' = length(x)))
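
One thing to be aware of with the aggregate() call above: because FUN returns a vector of length two, d comes back as a two-column matrix stored inside the data frame, not as two ordinary columns. A minimal sketch of flattening it, assuming the bla data frame from the question (the names res and flat are only illustrative):

```r
# Data frame from the question
bla <- data.frame(
  a = c(1,1,1,0,0,1,1,1,0,0),
  b = c(0,0,0,1,1,0,0,1,1,0),
  c = c(1,0,1,0,1,0,1,0,1,0),
  d = c(2,3,4,7,8,6,5,2,1,0)
)

# Same aggregate() call as above; res$d is a matrix with columns "mean" and "count"
res <- aggregate(d ~ ., data = bla,
                 FUN = function(x) c(mean = mean(x), count = length(x)))

# Rebuild the data frame so the matrix columns become d.mean and d.count
flat <- do.call(data.frame, res)
print(flat)
```

After this, flat$d.mean and flat$d.count can be used like any other column.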

The dplyr package also handles this (this would be my preference):

library(dplyr)
bla %>%
    group_by(a, b, c) %>% # or group_by_at(vars(-d))
    summarise(count = n(),
              mean_d = mean(d))
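
Note that both snippets in this answer drop combinations that never occur in the data. If those are needed in base R, one possible sketch uses tapply(), which returns NA for empty cells, alongside table() for the counts. It assumes the bla data frame from the question; keys, means and out are illustrative names only:

```r
# Data frame from the question
bla <- data.frame(
  a = c(1,1,1,0,0,1,1,1,0,0),
  b = c(0,0,0,1,1,0,0,1,1,0),
  c = c(1,0,1,0,1,0,1,0,1,0),
  d = c(2,3,4,7,8,6,5,2,1,0)
)

keys <- bla[c("a", "b", "c")]

# 2 x 2 x 2 array of group means; combinations with no rows are NA
means <- tapply(bla$d, keys, mean)

# Long format, one row per combination (a, b, c come back as factors)
out <- as.data.frame(as.table(means), responseName = "mean_d")

# table() walks the same combinations in the same order, so the counts line up
out$count <- as.vector(table(keys))
print(out)
```

The row order matches the table() output in the question, with a varying fastest.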

Upvotes: 3
