Reputation: 539
I am using dplyr to calculate some summary statistics across groups, but I would also like to get the same statistics for the whole data set, in the same line of code.
So far I can only think of:
aux.1 <- iris %>%
  group_by(Species) %>%
  summarise("stat1" = mean(Sepal.Length),
            "stat2" = sum(Petal.Length))
aux.2 <- iris %>%
  summarise("stat1" = mean(Sepal.Length),
            "stat2" = sum(Petal.Length))
Is there any way I can get all the stats in one line of code?
Upvotes: 2
Views: 475
Reputation: 93861
You need two separate dplyr chains, but you can put it all together with bind_rows:
aux <- bind_rows(
  iris %>%
    group_by(Species) %>%
    summarise("stat1" = mean(Sepal.Length),
              "stat2" = sum(Petal.Length)),
  iris %>%
    summarise("stat1" = mean(Sepal.Length),
              "stat2" = sum(Petal.Length)) %>%
    mutate(Species = "All")
)
aux
     Species    stat1 stat2
1     setosa 5.006000  73.1
2 versicolor 5.936000 213.0
3  virginica 6.588000 277.6
4        All 5.843333 563.7
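If the repeated summarise() call is the main annoyance, one option is to wrap it in a small helper and reuse it in both chains. This is only a sketch: summarise_stats is a hypothetical name introduced here, and Species is converted to character explicitly so that the "All" row binds without relying on how bind_rows combines a factor with a character column.

```r
library(dplyr)

# Hypothetical helper (not part of dplyr): the shared summary step.
summarise_stats <- function(df) {
  summarise(df,
            stat1 = mean(Sepal.Length),
            stat2 = sum(Petal.Length))
}

aux <- bind_rows(
  # per-group statistics
  iris %>%
    group_by(Species) %>%
    summarise_stats() %>%
    mutate(Species = as.character(Species)),
  # overall statistics, labelled "All"
  iris %>%
    summarise_stats() %>%
    mutate(Species = "All")
)
aux
```

The statistics are now defined in one place, so adding a third one only means editing the helper.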
Upvotes: 3
Reputation: 38510
In case you are interested in taking a look at the data.table package, this is easy to achieve:
library(data.table)
# setDT() converts by reference, so work on a separately named copy
# rather than on the built-in iris data.frame
irisTemp <- iris
setDT(irisTemp)
# calculate group statistics; := adds them as new columns on every row
irisTemp[, c("meanVal", "sumVal") := .(mean(Sepal.Length), sum(Petal.Length)),
         by = "Species"]
data.table can be a quick and efficient choice for large data sets.
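Note that the := form above attaches the group statistics to every row rather than returning a summary table, and it does not add an overall row. A minimal sketch of a data.table version that yields one row per species plus an "All" row, assuming the same two statistics as in the question (the names dt and res are chosen here for illustration):

```r
library(data.table)

# as.data.table() returns a new object, so iris itself is not modified
dt <- as.data.table(iris)

res <- rbind(
  # per-group statistics, with Species as character so it matches "All"
  dt[, .(stat1 = mean(Sepal.Length), stat2 = sum(Petal.Length)),
     by = .(Species = as.character(Species))],
  # overall statistics, labelled "All"
  dt[, .(Species = "All",
         stat1 = mean(Sepal.Length),
         stat2 = sum(Petal.Length))]
)
res
```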
Upvotes: 1