dreww2

Reputation: 1611

Using summarize_all and summarize together in dplyr

Is there a way to combine summarize_all and summarize statements together in a dplyr chain? Something like this:

library(dplyr)

data(mtcars)

mtcars %>%
  group_by(cyl) %>%
  summarize_all(funs(mean(., na.rm=TRUE))) %>%
  summarize(n = n())

But of course that doesn't work, because the second call tries to summarize the already-summarized result, so n() no longer counts the original rows.

The expected result is a single data.frame grouped by cyl, with each column summarized by its mean, plus a count of observations per cyl. I can get this by running two separate summary statements and combining them with bind_cols, but is there a better way?

Thanks

Upvotes: 4

Views: 724

Answers (1)

tyluRp

Reputation: 4768

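I think we could use add_count here. It adds the per-group count as a column n before summarizing, and grouping by both cyl and n keeps n out of the columns that get averaged: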

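library(dplyr)

mtcars %>%
  add_count(cyl) %>%                         # add n = number of rows per cyl
  group_by(cyl, n) %>%                       # group by n too, so it is not averaged
  summarise_all(.funs = mean, na.rm = TRUE)  # mean of every remaining column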
# A tibble: 3 x 12
# Groups:   cyl [?]
    cyl     n   mpg  disp    hp  drat    wt  qsec    vs    am  gear  carb
  <dbl> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1    4.    11  26.7  105.  82.6  4.07  2.29  19.1 0.909 0.727  4.09  1.55
2    6.     7  19.7  183. 122.   3.59  3.12  18.0 0.571 0.429  3.86  3.43
3    8.    14  15.1  353. 209.   3.23  4.00  16.8 0.    0.143  3.29  3.50
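
For anyone on dplyr 1.0.0 or later, the same idea can probably be written as a single summarise() call using across(). This is only a sketch of an alternative, not part of the original answer; the name result is just for illustration. Note that n = n() is placed after across(), so that across(everything()) does not pick up the new n column:

```r
library(dplyr)

# Assumes dplyr >= 1.0.0, where across() is available.
result <- mtcars %>%
  group_by(cyl) %>%
  summarise(
    # Mean of every non-grouping column
    across(everything(), ~ mean(.x, na.rm = TRUE)),
    # Row count per cyl; placed last so across() does not see it
    n = n(),
    .groups = "drop"
  )

result
```

This should give one row per cyl with the column means and n as the last column, rather than the second column as in the add_count version.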

Upvotes: 5
