Reputation: 23680
I want to append the group maximum to a table of observations, e.g.:
library(dplyr)
iris %>% split(iris$Species) %>%
  lapply(function(l) mutate(l, species_max = max(Sepal.Width))) %>%
  bind_rows() %>% .[c(1,51,101),]
    Sepal.Length Sepal.Width Petal.Length Petal.Width    Species species_max
1            5.1         3.5          1.4         0.2     setosa         4.4
51           7.0         3.2          4.7         1.4 versicolor         3.4
101          6.3         3.3          6.0         2.5  virginica         3.8
Is there a more elegant dplyr::group_by solution to achieve this?
Upvotes: 0
Views: 221
Reputation: 70336
How about this:
group_by(iris, Species) %>%
  mutate(species_max = max(Sepal.Width)) %>%
  slice(1)
# Source: local data frame [3 x 6]
# Groups: Species [3]
#
#   Sepal.Length Sepal.Width Petal.Length Petal.Width    Species species_max
#          <dbl>       <dbl>        <dbl>       <dbl>     <fctr>       <dbl>
# 1          5.1         3.5          1.4         0.2     setosa         4.4
# 2          7.0         3.2          4.7         1.4 versicolor         3.4
# 3          6.3         3.3          6.0         2.5  virginica         3.8
The difficulty here is that you need to summarise multiple columns (for which summarise_all would be great), but at the same time you need to add a new column (for which you need either a simple summarise or a mutate call).
In this regard, data.table allows greater flexibility since it only relies on a list in its j-argument. So you can do it as follows with data.table, just as a comparison:
library(data.table)
dt <- as.data.table(iris)
dt[, c(lapply(.SD, first), species_max = max(Sepal.Width)), by = Species]
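If you actually want to keep every observation rather than one row per species (the question's wording allows either reading), a minimal sketch would be to drop the slice(1) step in dplyr, or to use := in data.table. This assumes both packages are installed; iris_max and dt_max are just names picked for illustration.

```r
library(dplyr)
library(data.table)

# dplyr: a grouped mutate keeps all 150 rows; ungroup() removes the grouping afterwards
iris_max <- iris %>%
  group_by(Species) %>%
  mutate(species_max = max(Sepal.Width)) %>%
  ungroup()

# data.table: := adds the column by reference, recycling the group maximum over each group's rows
dt_max <- as.data.table(iris)
dt_max[, species_max := max(Sepal.Width), by = Species]
```

Both results have the original rows plus a species_max column, so you can still subset afterwards if you only need a few rows.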
Upvotes: 1