Reputation: 1372
I have a dataset that lists variables, the calculation I want to perform on each (sum, number of distinct values), and the new variable name for each result.
library(dplyr)
RefDf <- read.table(text = "Variables Calculation NewVariable
Sepal.Length sum Sepal.Length2
Petal.Length n_distinct Petal.LengthNew
", header = T)
Manual approach: summarise after grouping by the Species variable.
iris %>% group_by_at("Species") %>%
summarise(Sepal.Length2 = sum(Sepal.Length,na.rm = T),
Petal.LengthNew = n_distinct(Petal.Length, na.rm = T)
)
Automate via eval(parse( ))
x <- RefDf %>% mutate(Check = paste0(NewVariable, " = ", Calculation, "(", Variables, ", na.rm = T", ")")) %>% pull(Check)
iris %>% group_by_at("Species") %>% summarise(eval(parse(text = x)))
Currently it returns:
Species `eval(parse(text = x))`
<fct> <int>
1 setosa 9
2 versicolor 19
3 virginica 20
It should return:
Species Sepal.Length2 Petal.LengthNew
<fct> <dbl> <int>
1 setosa 250. 9
2 versicolor 297. 19
3 virginica 329. 20
Upvotes: 3
Views: 302
Reputation: 21908
Update: I found a way of sparing those extra lines.
This is just another way of getting your desired result. I would create a function call for each row of your data set and then iterate over the calls together with the new column names to build the desired output:
library(dplyr)
library(rlang)
library(purrr)
# First we add a list-column to your data set that holds one function call per row
RefDf %>%
rowwise() %>%
mutate(Call = list(call2(Calculation, parse_expr(Variables)))) -> Rf
Rf
# A tibble: 2 x 4
# Rowwise:
Variables Calculation NewVariable Call
<chr> <chr> <chr> <list>
1 Sepal.Length sum Sepal.Length2 <language>
2 Petal.Length n_distinct Petal.LengthNew <language>
# Then we iterate over `NewVariable` and `Call` in parallel, using the former as the
# new column name and evaluating the latter within each group
map2(Rf$NewVariable, Rf$Call, ~ iris %>% group_by(Species) %>%
summarise(!!.x := eval_tidy(.y))) %>%
reduce(~ left_join(.x, .y, by = "Species"))
# A tibble: 3 x 3
Species Sepal.Length2 Petal.LengthNew
<fct> <dbl> <int>
1 setosa 250. 9
2 versicolor 297. 19
3 virginica 329. 20
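The per-row calls can also be spliced into a single summarise() call, which avoids the join step. The following is only a minimal sketch under a few assumptions: the names `calls` and `result` are illustrative, `stringsAsFactors = FALSE` is added so the columns are character on older R versions, and `na.rm = TRUE` is passed to each call to mirror the manual approach in the question.

```r
library(dplyr)
library(rlang)

RefDf <- read.table(text = "Variables Calculation NewVariable
Sepal.Length sum Sepal.Length2
Petal.Length n_distinct Petal.LengthNew
", header = TRUE, stringsAsFactors = FALSE)

# Build one call per row, e.g. sum(Sepal.Length, na.rm = TRUE)
calls <- Map(
  function(f, v) call2(f, sym(v), na.rm = TRUE),
  RefDf$Calculation,
  RefDf$Variables
)

# Name each call after the column it should create
names(calls) <- RefDf$NewVariable

# Splice the named calls into one summarise() call
result <- iris %>%
  group_by(Species) %>%
  summarise(!!!calls, .groups = "drop")

result
```

Because the list is named, each call becomes a named argument of summarise(), so all new columns are created in one pass over the grouped data.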
Upvotes: 3
Reputation: 5456
You can use parse_exprs:
library(tidyverse)
library(rlang)
RefDf <- read.table(text = "Variables Calculation NewVariable
Sepal.Length sum Sepal.Length2
Petal.Length n_distinct Petal.LengthNew
", header = T)
# Build one expression string per row, named after the new variable
expr_txt <- set_names(str_c(RefDf$Calculation, "(", RefDf$Variables, ")"),
RefDf$NewVariable)
iris %>%
group_by_at("Species") %>%
summarise(!!!parse_exprs(expr_txt), .groups = "drop")
## A tibble: 3 x 3
#Species Sepal.Length2 Petal.LengthNew
#<fct> <dbl> <int>
#1 setosa 250. 9
#2 versicolor 297. 19
#3 virginica 329. 20
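For context on why the eval(parse()) attempt in the question yields a single column, here is a small sketch with toy strings (the names `x`, `e`, `last_value` and `exprs` are made up for illustration). As far as I understand it, parse() turns each "name = value" string into an assignment, and eval() on the resulting expression vector evaluates every element but returns only the last value, whereas parse_exprs() gives back a named list that !!! can splice as separate arguments.

```r
library(rlang)

x <- c("a = 1 + 1", "b = 2 * 5")

# Evaluate in a scratch environment so the assignments are visible afterwards
e <- new.env()
last_value <- eval(parse(text = x), envir = e)

last_value  # only the value of the last expression is returned
ls(e)       # both strings were treated as assignments inside `e`

# parse_exprs() keeps one expression per string and preserves the names
exprs <- parse_exprs(c(a = "1 + 1", b = "2 * 5"))
names(exprs)
```

Inside summarise() that single unnamed value becomes one column named after the code that produced it, which matches the output shown in the question.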
Upvotes: 3