Reputation: 39
Hello fellow overflowers,
I'm trying to calculate the means of several variables for each combination of type and gender. My df looks like this (~600 rows):
col1 col2 col3 col4 col5
<type> <gender> <var1> <var2> <var3>
1 A 1 3 2 3
2 A 2 NA 5 NA
3 A 1 3 3 5
4 B 1 4 NA 1
5 B 2 3 4 5
Now the result should look like this:
col1 col2 col3 col4 col5
<type> <gender> <mean-var1> <mean-var2> <mean-var3>
1 A 1 3.6 4.1 4.6
2 A 2 4.1 3.8 4.2
3 B 1 3.9 4.2 3.7
4 B 2 4.3 3.2 2.7
5 C 1 3.5 4.5 3.6
6 C 2 4 3.7 4.2
...
So far, I've tried to use the group_by function:
avg_values<-data%>%
group_by(type, gender) %>%
summarize_all (mean())
This didn't work. Could you help me figure out a good way to handle this?
Upvotes: 0
Reputation: 11584
Does this work?
library(dplyr)
df %>% group_by(type, gender) %>% summarise(across(var1:var3, ~ mean(.x, na.rm = TRUE)))
`summarise()` regrouping output by 'type' (override with `.groups` argument)
# A tibble: 4 x 5
# Groups: type [2]
type gender var1 var2 var3
<chr> <dbl> <dbl> <dbl> <dbl>
1 A 1 3 2.5 4
2 A 2 NaN 5 NaN
3 B 1 4 NaN 1
4 B 2 3 4 5
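A note on the NaN entries above: they show up where every value of a column in a group is NA, so once na.rm = TRUE drops them, mean() is taken over an empty vector. If NA is preferred in those cells, one option is a small wrapper. This is only a sketch, and mean_or_na is a hypothetical helper name, not something dplyr provides:

```r
# mean() over a vector that is entirely NA gives NaN once the NAs are removed
all_missing <- c(NA_real_, NA_real_)
nan_result <- mean(all_missing, na.rm = TRUE)

# Hypothetical helper (not part of dplyr): return NA instead of NaN
# when there is nothing left to average
mean_or_na <- function(x) {
  if (all(is.na(x))) NA_real_ else mean(x, na.rm = TRUE)
}

na_result <- mean_or_na(all_missing)
ok_result <- mean_or_na(c(2, NA, 3))
```

It could then be passed in place of the lambda, e.g. across(var1:var3, mean_or_na).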
Data used:
df
# A tibble: 5 x 5
type gender var1 var2 var3
<chr> <dbl> <dbl> <dbl> <dbl>
1 A 1 3 2 3
2 A 2 NA 5 NA
3 A 1 3 3 5
4 B 1 4 NA 1
5 B 2 3 4 5
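As for why the attempt in the question fails: summarize_all(mean()) calls mean() immediately with no argument instead of handing the function itself to summarize_all. A minimal sketch of that version with the function passed by name, assuming the columns are called type, gender and var1 to var3 as in the data above (the five rows are just the sample, rebuilt here so the snippet runs on its own):

```r
library(dplyr)

# Sample rows from the question, rebuilt by hand (assumed column names)
data <- tibble(
  type   = c("A", "A", "A", "B", "B"),
  gender = c(1, 2, 1, 1, 2),
  var1   = c(3, NA, 3, 4, 3),
  var2   = c(2, 5, 3, NA, 4),
  var3   = c(3, NA, 5, 1, 5)
)

# Pass `mean` itself (no parentheses); na.rm = TRUE is forwarded to it
avg_values <- data %>%
  group_by(type, gender) %>%
  summarise_all(mean, na.rm = TRUE)
```

summarise_all() still works but is superseded in current dplyr, so the across() form above is probably the better choice for new code.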
Upvotes: 1