Dee G

Reputation: 154

R - create summary table of means and counts by group for multiple columns

I have data which I want to group by one column and then summarise with means and counts by group for multiple columns. Some example data (my data has more columns and groups to be summarised):

df <- data.frame( 
   group = c("A", "A", "B", "B", "B", "B"),
   var1 = c(623.3, 515.2, 611.0, 729.0, NA, 911.5), 
   var2 = c(42, 28, 43, 51, 26, 64),
   stringsAsFactors = FALSE
)
print(df)

  group  var1 var2
1     A 623.3   42
2     A 515.2   28
3     B 611.0   43
4     B 729.0   51
5     B    NA   26
6     B 911.5   64

I want a summary table grouped by group which has means and counts for the other variables, with NAs ignored. It should look something like this:

     group    mean.var1    count.var1   mean.var2   count.var2
1     A        569.25         2             35          2
2     B        750.5          3             46          4

This is the preferable order, although variable names aren't important as long as it's obvious which variable and which function (mean or count) it refers to. Decimal places aren't important either.

Upvotes: 1

Views: 2159

Answers (1)

akrun

Reputation: 887078

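We can group by 'group' and summarise across the numeric columns to get the mean and the count of non-NA values (sum(!is.na(.x))). By default the result columns are named var1_mean, var1_count, var2_mean and var2_count: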

library(dplyr)

df %>%
  group_by(group) %>%
  # for every numeric column, compute the mean and the non-NA count per group
  summarise(across(where(is.numeric),
    list(mean = ~ mean(.x, na.rm = TRUE), count = ~ sum(!is.na(.x)))))
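If the column names should follow the mean.var1 / count.var1 pattern from the question, one option is the .names argument of across(). The sketch below assumes dplyr 1.0.0 or later and rebuilds the example data so it runs on its own; the name res is only illustrative.

```r
library(dplyr)

# Example data from the question
df <- data.frame(
  group = c("A", "A", "B", "B", "B", "B"),
  var1 = c(623.3, 515.2, 611.0, 729.0, NA, 911.5),
  var2 = c(42, 28, 43, 51, 26, 64),
  stringsAsFactors = FALSE
)

res <- df %>%
  group_by(group) %>%
  summarise(across(
    where(is.numeric),
    # mean ignoring NAs, and the number of non-NA values
    list(mean = ~ mean(.x, na.rm = TRUE), count = ~ sum(!is.na(.x))),
    # "{.fn}" is the name given in the list, "{.col}" the column name
    .names = "{.fn}.{.col}"
  ))

print(res)
```

The columns should come out in the order group, mean.var1, count.var1, mean.var2, count.var2, because across() applies every function to one column before moving on to the next.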

Upvotes: 1
