Reputation: 178
I am attempting to create a function which takes a list as input, and returns a summarised data frame. However, after trying multiple ways, I am unable to pass a list to the function for the aggregation.
So far I have the following, but it is failing.
library(plyr)  # ddply() and .() are plyr functions, not dplyr
random_df <- data.frame(
  region = c("A", "B", "C", "C"),
  number_of_reports = c(1, 3, 2, 1),
  report_MV = c(12, 33, 22, 12)
)

output_graph <- function(input) {
  print(input$arguments)
  DF <- input$DF
  group_by <- input$group_by
  args <- input$arguments
  flow <- ddply(DF, group_by, summarize, args)
  return(flow)
}

graph_functions <- list(
  DF = random_df,
  group_by = .(region),
  arguments = .(Reports = sum(number_of_reports),
                MV_Reports = sum(report_MV))
)

output_graph(graph_functions)
Whereas this works, with the summaries written directly in the ddply call:
library(plyr)  # ddply() and .() are plyr functions, not dplyr
random_df <- data.frame(
  region = c("A", "B", "C", "C"),
  number_of_reports = c(1, 3, 2, 1),
  report_MV = c(12, 33, 22, 12)
)

output_graph <- function(input) {
  print(input$arguments)
  DF <- input$DF
  group_by <- input$group_by
  args <- input$arguments
  flow <- ddply(
    DF,
    group_by,
    summarize,
    Reports = sum(number_of_reports),
    MV_Reports = sum(report_MV)
  )
  return(flow)
}

graph_functions <- list(
  DF = random_df,
  group_by = .(region),
  arguments = .(Reports = sum(number_of_reports),
                MV_Reports = sum(report_MV))
)

output_graph(graph_functions)
Would anyone be aware of a way to pass a list of functions to ddply? Or is there another way to achieve the same goal of aggregating a dynamic set of variables?
Upvotes: 1
Views: 357
Reputation: 6264
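To pass arguments into a function for use by dplyr, I recommend reading about non-standard evaluation (NSE). Here is an edited function producing the same output as my original. enquo() captures the bare grouping column, !! unquotes it inside group_by(), and !!! splices the list of quosures into summarise() as named arguments.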
library(dplyr)
random_df <- data.frame(
  region = c('A','B','C','C'),
  number_of_reports = c(1, 3, 2, 1),
  report_MV = c(12, 33, 22, 12)
)

output_graph <- function(df, group, args) {
  grp_quo <- enquo(group)
  df %>%
    group_by(!!grp_quo) %>%
    summarise(!!!args)
}

args <- list(
  Reports = quo(sum(number_of_reports)),
  MV_Reports = quo(sum(report_MV))
)

output_graph(random_df, region, args)
# # A tibble: 3 x 3
#   region Reports MV_Reports
#   <fctr>   <dbl>      <dbl>
# 1 A         1.00       12.0
# 2 B         3.00       33.0
# 3 C         3.00       34.0
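If you would rather keep the single-list interface from the question, the same idea can be sketched with quos(), which dplyr re-exports from rlang and which builds a named list of quosures in one call. The names output_graph_from_list and graph_spec below are hypothetical, chosen only for this illustration, and the sketch assumes a dplyr version with tidy evaluation (0.7 or later).

```r
library(dplyr)

random_df <- data.frame(
  region = c("A", "B", "C", "C"),
  number_of_reports = c(1, 3, 2, 1),
  report_MV = c(12, 33, 22, 12)
)

# Hypothetical helper: takes one list holding the data frame, the grouping
# columns, and the summary expressions, all captured as quosures.
output_graph_from_list <- function(input) {
  input$DF %>%
    group_by(!!!input$group_by) %>%    # splice the grouping quosures
    summarise(!!!input$arguments)      # splice the named summary quosures
}

# Hypothetical spec list mirroring graph_functions from the question,
# with quos() taking the place of plyr's .()
graph_spec <- list(
  DF = random_df,
  group_by = quos(region),
  arguments = quos(
    Reports = sum(number_of_reports),
    MV_Reports = sum(report_MV)
  )
)

result <- output_graph_from_list(graph_spec)
print(result)
```

Because quos() accepts any number of expressions, adding another grouping column or summary only means adding an entry to the list, not changing the function.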
Upvotes: 1