How can I get the overall stats when using group_by in dplyr?

I am using dplyr to calculate some summary statistics across groups, but I would also like to get the same stats for the whole data set, in the same line of code.

So far I can only think of:

aux.1 <- iris %>%
  group_by(Species) %>%
  summarise("stat1" = mean(Sepal.Length),
            "stat2" = sum(Petal.Length))

aux.2 <- iris %>%
  summarise("stat1" = mean(Sepal.Length),
            "stat2" = sum(Petal.Length))

Is there any way I can get all the stats in one line of code?

Upvotes: 2

Views: 475

Answers (2)

eipi10

Reputation: 93861

You need two separate dplyr chains, but you can put it all together with bind_rows:

aux <- bind_rows(
  iris %>% 
    group_by(Species)  %>% 
    summarise("stat1" = mean(Sepal.Length), 
              "stat2" = sum(Petal.Length)), 
  iris %>% 
    summarise("stat1" = mean(Sepal.Length), 
              "stat2" = sum(Petal.Length)) %>%
    mutate(Species = "All")
  )

aux
     Species    stat1 stat2
1     setosa 5.006000  73.1
2 versicolor 5.936000 213.0
3  virginica 6.588000 277.6
4        All 5.843333 563.7
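If repeating the summary expressions in both chains is a concern, one option is to move them into a small function. The following is only a sketch: summarise_stats is a hypothetical helper name, not part of dplyr, and Species is converted to character up front so that the "All" row combines cleanly with the group rows.

```r
library(dplyr)

# Hypothetical helper (not a dplyr function): the summary step shared by both chains.
# summarise() respects any grouping already present on `data`.
summarise_stats <- function(data) {
  data %>%
    summarise(stat1 = mean(Sepal.Length),
              stat2 = sum(Petal.Length))
}

aux <- bind_rows(
  # per-species statistics
  iris %>%
    mutate(Species = as.character(Species)) %>%
    group_by(Species) %>%
    summarise_stats(),
  # overall statistics, labelled "All"
  iris %>%
    summarise_stats() %>%
    mutate(Species = "All")
)
```

This still runs two chains, as above, but the statistics are defined in one place only.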

Upvotes: 3

lmo

Reputation: 38510

In case you are interested in taking a look at the data.table package, this is easy to achieve:

library(data.table)
# copy the built-in iris data.frame so setDT() can convert the copy by reference
irisTemp <- iris
setDT(irisTemp)

# add the group statistics as new columns (each row gets its group's values)
irisTemp[, c("meanVal", "sumVal") := .(mean(Sepal.Length), sum(Petal.Length)),
         by = "Species"]

data.table can be a quick and efficient library for large data sets.
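Note that := adds the group statistics as columns to every row rather than producing a summary table with an overall row. A possible way to get one row per species plus an "All" row with data.table is sketched below; the names irisDT, grouped, overall and result are illustrative only, and Species is converted to character so the two pieces stack without mixing factor and character columns.

```r
library(data.table)

# as.data.table() returns a copy, so the built-in iris is left untouched
irisDT <- as.data.table(iris)
irisDT[, Species := as.character(Species)]

# one row per species
grouped <- irisDT[, .(meanVal = mean(Sepal.Length),
                      sumVal  = sum(Petal.Length)),
                  by = Species]

# a single row for the whole data set, labelled "All"
overall <- irisDT[, .(Species = "All",
                      meanVal = mean(Sepal.Length),
                      sumVal  = sum(Petal.Length))]

# stack the group rows and the overall row
result <- rbind(grouped, overall)
```

As with the bind_rows approach, this computes the two summaries separately and then stacks them.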

Upvotes: 1
