Reputation: 154
I have data which I want to group by one column and then summarise with means and counts by group for multiple columns. Some example data (my data has more columns and groups to be summarised):
df <- data.frame(
  group = c("A", "A", "B", "B", "B", "B"),
  var1 = c(623.3, 515.2, 611.0, 729.0, NA, 911.5),
  var2 = c(42, 28, 43, 51, 26, 64),
  stringsAsFactors = FALSE
)
print(df)
  group  var1 var2
1     A 623.3   42
2     A 515.2   28
3     B 611.0   43
4     B 729.0   51
5     B    NA   26
6     B 911.5   64
I want a summary table grouped by group which has means and counts for the other variables, with NAs ignored. It should look something like this:
  group mean.var1 count.var1 mean.var2 count.var2
1     A    569.25          2        35          2
2     B     750.5          3        46          4
This is the preferable order, although variable names aren't important as long as it's obvious which variable and which function (mean or count) it refers to. Decimal places aren't important either.
Upvotes: 1
Reputation: 887078
We may group by 'group' and summarise across the numeric columns to get the mean (with na.rm = TRUE) and the count of non-NA values, sum(!is.na(.x)):
library(dplyr)

df %>%
  group_by(group) %>%
  # apply both functions to every numeric column; NAs are ignored in each
  summarise(across(where(is.numeric),
    list(mean = ~ mean(.x, na.rm = TRUE), count = ~ sum(!is.na(.x)))))
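By default across() names the results column first (var1_mean, var1_count, and so on). If the names should follow the layout in the question (mean.var1, count.var1, ...), one option is the .names argument of across(). A minimal sketch, assuming dplyr 1.0.0 or later and the example df from the question; the naming pattern "{.fn}.{.col}" is only an illustrative choice:

```r
library(dplyr)

# Example data copied from the question
df <- data.frame(
  group = c("A", "A", "B", "B", "B", "B"),
  var1 = c(623.3, 515.2, 611.0, 729.0, NA, 911.5),
  var2 = c(42, 28, 43, 51, 26, 64),
  stringsAsFactors = FALSE
)

out <- df %>%
  group_by(group) %>%
  summarise(
    across(
      where(is.numeric),
      list(mean = ~ mean(.x, na.rm = TRUE), count = ~ sum(!is.na(.x))),
      # assumed naming pattern: function name first, then column name
      .names = "{.fn}.{.col}"
    )
  )

print(out)
```

The columns come out grouped per variable (mean then count for var1, then the same for var2), because across() applies every function in the list to one column before moving to the next.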
Upvotes: 1