Nader Mehri

Reputation: 556

Getting means of several columns after grouping them

Using the code below, I am trying to group my data (Diss) by gender and CG_less14 and then obtain the means of columns 5 to 29. I would then like to round the means to two decimal places and print them in the console so I can use them manually in further analyses.

I got an error: Error in t(., round(colMeans(Diss[, 5:29]), 2)) : unused argument (round(colMeans(Diss[, 5:29]), 2))

 Diss %>%
   group_by(gender, CG_less14) %>%
         t(round(colMeans(Diss[,5:29]),2))

Upvotes: 1

Views: 154

Answers (1)

akrun

Reputation: 887118

With dplyr, we can use summarise_at

library(dplyr)
Diss %>%
    group_by(gender, CG_less14) %>%
    summarise_at(5:29, ~ round(mean(.), 2))

In base R, we can use aggregate

aggregate(.~ gender + CG_less14, Diss, function(x) round(mean(x), 2))
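
As a rough sketch of the base R route on built-in data (iris stands in for Diss, and res is only an illustrative name): with . on the left-hand side, aggregate summarises every column that is not a grouping variable, so cbind() is used here to restrict the summary to the columns of interest.

```r
# Base R sketch on the built-in iris data (a stand-in for Diss).
# cbind() on the left-hand side limits the summary to the chosen columns;
# with `.` every non-grouping column would be summarised instead.
res <- aggregate(cbind(Sepal.Length, Sepal.Width) ~ Species,
                 data = iris,
                 FUN = function(x) round(mean(x), 2))
print(res)
```

The result is an ordinary data frame with one row per group, so it prints directly in the console.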

A reproducible example with iris

iris %>%
     group_by(Species) %>% 
     summarise_at(1:2, ~ round(mean(.), 2))
# A tibble: 3 x 3
#  Species    Sepal.Length Sepal.Width
#  <fct>             <dbl>       <dbl>
#1 setosa             5.01        3.43
#2 versicolor         5.94        2.77
#3 virginica          6.59        2.97
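
In recent dplyr releases summarise_at is documented as superseded by across(). As a hedged sketch, assuming dplyr 1.0.0 or later is installed, the same iris summary could be written as below. The columns are selected by name here to avoid any ambiguity about positions, and res is just an illustrative variable name.

```r
library(dplyr)

# Assumes dplyr >= 1.0.0, where across() is available.
res <- iris %>%
  group_by(Species) %>%
  # apply the same rounding-of-the-mean function to each selected column
  summarise(across(c(Sepal.Length, Sepal.Width), ~ round(mean(.x), 2)))

print(res)
```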

Note that after the group_by, the grouped data can be accessed with .data or with . inside the pipeline. If we instead subset the original data object (e.g. Diss[, 5:29]), the grouping is bypassed and we get the whole column rather than the values for each group.
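
A small sketch on iris may make the difference concrete. The names whole and grouped are only illustrative: the first summary refers back to the original object and so ignores the groups, while the second refers to the column inside the pipeline.

```r
library(dplyr)

# Referring back to the original object (iris$...) ignores the groups:
# every group gets the mean of the whole column.
whole <- iris %>%
  group_by(Species) %>%
  summarise(m = round(mean(iris$Sepal.Length), 2))

# Referring to the column inside the pipeline respects the groups.
grouped <- iris %>%
  group_by(Species) %>%
  summarise(m = round(mean(Sepal.Length), 2))

print(whole$m)    # the same value repeated for all three species
print(grouped$m)  # one mean per species
```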

If we want to use colMeans, an option is to split the data by the grouping variable with group_split, loop over the resulting list, select the columns of interest and apply colMeans to each piece

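library(dplyr)
library(purrr)
iris %>%
   group_split(Species, .keep = FALSE) %>%  # one data frame per species, grouping column dropped
   map_dfr(~ .x %>% 
                select(1:2) %>%    # columns of interest
                colMeans %>%       # named vector of column means
                round(2))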
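
For comparison, a base R sketch of the same split-then-colMeans idea, again on iris and with res as an illustrative name. sapply returns the groups as columns, so t() turns the result into one row per group.

```r
# Split the columns of interest by the grouping variable,
# take rounded column means within each piece, then transpose
# so that each group becomes a row.
pieces <- split(iris[, 1:2], iris$Species)
res <- t(sapply(pieces, function(d) round(colMeans(d), 2)))
print(res)
```

The result is a numeric matrix with the group labels as row names, which prints compactly in the console.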

Upvotes: 2
