Reputation: 1493
Is there a way, using functional programming, to repeat the same operations on different subsets of a dataset?
Below is an example of how I would do it "manually". My question is whether the same summary can be applied to different subsets of the same dataset without repeating the code.
Here is a sample dataset:
dt <- data.frame(group = rep(LETTERS[1:3], each = 12 * 3),
                 year  = rep(2018:2020, each = 12),
                 month = rep(1:12, times = 3),
                 value = rnorm(12 * 3 * 3, 2, .3))
This is what I am doing right now. There are three groupings: per group; per group and per year; and per group and per year for a subset of the months. The same summary (mean, min, max) is then computed for each. The code below does what I want, but I wonder whether there is a more efficient, less repetitive way to do it, ideally using dplyr.
library(dplyr)

bind_rows(
  # First grouping
  dt %>% group_by(group) %>%
    # Common summary
    summarise(mean = mean(value),
              min = min(value),
              max = max(value)) %>%
    mutate(grouping = "per group"),
  # Second grouping
  dt %>% group_by(group, year) %>%
    # Common summary
    summarise(mean = mean(value),
              min = min(value),
              max = max(value)) %>%
    mutate(grouping = "per group and per year"),
  # Third grouping
  dt %>% filter(month %in% 6:8) %>% group_by(group, year) %>%
    # Common summary
    summarise(mean = mean(value),
              min = min(value),
              max = max(value)) %>%
    mutate(grouping = "per group, summer months")
)
Any ideas?
Upvotes: 1
Views: 120
Reputation: 12839
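library(purrr)
library(dplyr)

# Each element is a function (built with `. %>%`) that subsets and groups the data
groupings <- list(
  . %>% group_by(group),
  . %>% group_by(group, year),
  . %>% filter(month %in% 6:8) %>% group_by(group, year)
)

# Labels, in the same order as `groupings`
grouping_labels <- list(
  "per group",
  "per group and per year",
  "per group, summer months"
)

# The summary shared by all groupings
common_summary <- . %>%
  summarise(mean = mean(value),
            min = min(value),
            max = max(value))

# Apply each grouping, summarise, label, then stack the results
map2(
  groupings,
  grouping_labels,
  ~ dt %>% .x() %>% common_summary() %>% mutate(grouping = .y)
) %>%
  bind_rows()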
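A note on why this works: a pipeline that starts with `.` (for example `. %>% group_by(group)`) is not evaluated immediately. magrittr turns it into a function of one argument, which is why `.x()` can be called on `dt` inside `map2()`. As a possible variant, here is a minimal sketch that keeps the labels and the subsets together in one named list and lets `bind_rows(.id = ...)` add the label column. The helper name `summarise_value`, the `set.seed()` call, and the `.groups = "drop"` argument (which assumes dplyr 1.0.0 or later) are my own additions for illustration, not part of the code above.

```r
library(dplyr)
library(purrr)

set.seed(1)  # assumption: fixed seed only so the sample data is reproducible
dt <- data.frame(group = rep(LETTERS[1:3], each = 12 * 3),
                 year  = rep(2018:2020, each = 12),
                 month = rep(1:12, times = 3),
                 value = rnorm(12 * 3 * 3, 2, .3))

# Hypothetical helper: the summary shared by every grouping
summarise_value <- function(data) {
  data %>%
    summarise(mean = mean(value),
              min  = min(value),
              max  = max(value),
              .groups = "drop")  # return ungrouped results
}

# The list names double as the grouping labels
subsets <- list(
  "per group"                = dt %>% group_by(group),
  "per group and per year"   = dt %>% group_by(group, year),
  "per group, summer months" = dt %>% filter(month %in% 6:8) %>% group_by(group, year)
)

# Summarise each subset, then stack; `.id` stores the list names in `grouping`
result <- subsets %>%
  map(summarise_value) %>%
  bind_rows(.id = "grouping")
```

The trade-off is that the grouped data frames are built up front rather than stored as functions, so this variant suits a single dataset, while the list of `. %>%` functions can be reused on other data frames with the same columns.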
Upvotes: 3