Reputation: 1
I would like to get a list of summarised tibbles, one for each set of grouping variables (dots) passed to group_by_ on a data frame.
require(tidyverse)
data(mtcars)
# create dots of groups for dplyr::group_by_ function
dots1 <- lapply(c("am", "gear"), as.symbol)
dots2 <- lapply(c("am", "carb"), as.symbol)
l <- list(dots1, dots2)
# group_by then summarise for one dots
mtcars %>%
  group_by_(.dots = dots1) %>%
  summarise(cyl_mean = mean(mpg),
            cyl_sd = sd(mpg))
How can I write code that returns a list of n tibbles, one for each of the n sets of dots? Perhaps something using purrr::map()?
I would like to avoid copy-pasting the code, since I will be using many sets of dots.
I tried
> mtcars %>% group_by_(.dots = l)
but it gave
Error: Can't convert a list to a quosure
I can get the desired output using a for loop, but I was wondering whether there is an alternative that does not use a for loop.
list_groups <- list()
for (i in seq_along(l)) {
  res <- mtcars %>%
    group_by_(.dots = l[[i]]) %>%
    summarise(cyl_mean = mean(mpg),
              cyl_sd = sd(mpg))
  list_groups[[i]] <- res
}
list_groups
[[1]]
# A tibble: 4 x 4
# Groups:   am [?]
     am  gear cyl_mean cyl_sd
  <dbl> <dbl>    <dbl>  <dbl>
1  0     3.00     16.1   3.37
2  0     4.00     21.0   3.07
3  1.00  4.00     26.3   5.41
4  1.00  5.00     21.4   6.66
[[2]]
# A tibble: 9 x 4
# Groups:   am [?]
     am  carb cyl_mean cyl_sd
  <dbl> <dbl>    <dbl>  <dbl>
1  0     1.00     20.3   1.93
2  0     2.00     19.3   3.74
3  0     3.00     16.3   1.05
4  0     4.00     14.3   3.36
5  1.00  1.00     29.1   5.06
6  1.00  2.00     27.0   4.30
7  1.00  4.00     19.3   3.00
8  1.00  6.00     19.7    NaN
9  1.00  8.00     15.0    NaN
Upvotes: 0
Views: 175
Reputation: 10671
This is how you could do it tidy-eval style, using newer functions from the rlang package to help with the non-standard evaluation:
library(tidyverse); library(rlang)
l2 <- list(c("am", "gear"), c("am", "carb")) %>%
  map(syms) # use syms() to turn each character vector into a list of symbols
map(l2, ~ group_by(mtcars, !!!.x) %>% # use !!! to unquote-splice the symbols into group_by()
      summarise(cyl_mean = mean(mpg), # back to business as usual
                cyl_sd = sd(mpg)))
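If you are on a more recent dplyr (this assumes version 1.0.0 or later, where across() and the .groups argument are available), the same idea can probably be written without symbols at all by keeping the grouping variables as character vectors. This is only a sketch; the name group_vars is made up for illustration:

```r
library(dplyr)
library(purrr)

# Grouping variables kept as plain character vectors (hypothetical name)
group_vars <- list(c("am", "gear"), c("am", "carb"))

# One summarised tibble per set of grouping variables
list_groups <- map(group_vars, function(vars) {
  mtcars %>%
    group_by(across(all_of(vars))) %>%   # select grouping columns by name
    summarise(cyl_mean = mean(mpg),
              cyl_sd   = sd(mpg),
              .groups  = "drop")         # return ungrouped tibbles
})
```

The result is a list of two tibbles with the same columns as in the question; the only difference is that the grouping is dropped after summarise().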
Upvotes: 1
Reputation: 6695
You could use a simple lapply approach:
list_groups <- lapply(l, function(x) mtcars %>%
                        group_by_(.dots = x) %>%
                        summarise(cyl_mean = mean(mpg),
                                  cyl_sd = sd(mpg)))
[[1]]
# A tibble: 4 x 4
# Groups:   am [?]
     am  gear cyl_mean   cyl_sd
  <dbl> <dbl>    <dbl>    <dbl>
1     0     3 16.10667 3.371618
2     0     4 21.05000 3.069745
3     1     4 26.27500 5.414465
4     1     5 21.38000 6.658979
[[2]]
# A tibble: 9 x 4
# Groups:   am [?]
     am  carb cyl_mean   cyl_sd
  <dbl> <dbl>    <dbl>    <dbl>
1     0     1 20.33333 1.934770
2     0     2 19.30000 3.738449
3     0     3 16.30000 1.053565
4     0     4 14.30000 3.362539
5     1     1 29.10000 5.061620
6     1     2 27.05000 4.300000
7     1     4 19.26667 3.002221
8     1     6 19.70000      NaN
9     1     8 15.00000      NaN
Advantages: you don't need to initialise an empty result list beforehand, the code is more concise, and in the benchmark below it ran roughly twice as fast (median 5.5 ms versus 10.1 ms).
Unit: milliseconds
             expr      min       lq      mean    median        uq       max neval cld
   lapply_variant 5.130514 5.346550  5.980762  5.515367  5.776913 209.59276  1000  a
 for_loop_variant 9.298755 9.787714 10.457785 10.064051 10.485171  37.54062  1000   b
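For reference, a comparison like this could be set up roughly as follows. This is a sketch under assumptions: it uses the microbenchmark package (the output above looks like its print format, but the original call is not shown), and the function bodies are reconstructed from the code in the question and in this answer. Only the names lapply_variant and for_loop_variant come from the expr column. Note that group_by_() is deprecated in newer dplyr releases and may emit warnings.

```r
library(dplyr)
library(microbenchmark)

# Sets of grouping symbols, as in the question
dots1 <- lapply(c("am", "gear"), as.symbol)
dots2 <- lapply(c("am", "carb"), as.symbol)
l <- list(dots1, dots2)

# lapply version from this answer
lapply_variant <- function() {
  lapply(l, function(x) mtcars %>%
           group_by_(.dots = x) %>%
           summarise(cyl_mean = mean(mpg),
                     cyl_sd = sd(mpg)))
}

# for-loop version from the question
for_loop_variant <- function() {
  list_groups <- list()
  for (i in seq_along(l)) {
    list_groups[[i]] <- mtcars %>%
      group_by_(.dots = l[[i]]) %>%
      summarise(cyl_mean = mean(mpg),
                cyl_sd = sd(mpg))
  }
  list_groups
}

# times = 100 keeps the run short; the output above used 1000 evaluations
bench <- microbenchmark(lapply_variant = lapply_variant(),
                        for_loop_variant = for_loop_variant(),
                        times = 100)
print(bench)
```

Timings will differ by machine and package versions, so treat the numbers above as indicative only.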
Upvotes: 1