Reputation: 521
The following toy data has 5 variables, X1 to X5.
set.seed(123)
df <- data.frame(matrix(rnorm(500), 100, 5))
I want to perform specific operations on specific variables, using a named list of purrr-style lambda formulas:
fun_list <- list(
X2 = ~ quantile(.x, c(0.1, 0.9), na.rm = TRUE),
X4 = ~ fivenum(.x, na.rm = TRUE)
)
How can I apply fun_list to my df according to its variable names?
I know rlang::as_function() can convert a purrr-style formula into an R function. But I guess there is some function that can handle purrr-style formulas directly. Its usage might be
execute(fun_list, environment = df)
The expected output is
$X2
10% 90%
-1.289408 1.058432
$X4
[1] -2.465898194 -0.737146704 -0.003508661 0.693634712 2.571458146
Upvotes: 2
Views: 87
Reputation: 269291
1) Here is a base R solution. First we create a function, fo2fun, which accepts a formula and outputs the corresponding function. Then execute is a function with a one-statement body that uses Map to apply the converted function to the element of envir named by each list name/index.
fo2fun <- function(formula) {
  f <- function(.x) {}
  body(f) <- formula[[2]]                 # right-hand side of a one-sided formula
  environment(f) <- environment(formula)
  f
}
execute <- function(funs, envir = parent.frame()) {
  # use names(funs), not the global fun_list, so any named list of formulas works
  Map(\(fo, index) fo2fun(fo)(envir[[index]]), funs, names(funs))
}
# test
expected <- list(
X2 = quantile(df$X2, c(0.1, 0.9), na.rm = TRUE),
X4 = fivenum(df$X4, na.rm = TRUE)
)
execute(fun_list, df) |> identical(expected)
## [1] TRUE
execute(fun_list, list2env(df)) |> identical(expected)
## [1] TRUE
list2env(df, .GlobalEnv)
execute(fun_list) |> identical(expected)
## [1] TRUE
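To see what fo2fun produces on its own, here is a minimal sketch applied to a single formula. The formula ~ .x + k and the variable k are made up for illustration and are not part of the question; fo2fun is repeated so the snippet runs standalone.

```r
# fo2fun as defined above, repeated so this snippet is self-contained
fo2fun <- function(formula) {
  f <- function(.x) {}
  body(f) <- formula[[2]]                 # right-hand side of a one-sided formula
  environment(f) <- environment(formula)  # free variables resolve where the formula was created
  f
}

k <- 10                     # hypothetical free variable used by the formula
add_k <- fo2fun(~ .x + k)   # hypothetical formula; behaves like function(.x) .x + k
add_k(1:3)
## [1] 11 12 13
```

Because the function inherits the formula's environment, k is looked up where the formula was written, not inside execute.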
2) This is the same as (1) except we have used match.funfn from the gsubfn package in place of fo2fun.
With this approach the formal argument is not restricted to .x; instead, match.funfn assumes that any free variable found in the formula is the argument. Optionally, the argument variable can be specified on the left-hand side of the formula. That syntax is needed when the formula contains free variables that are not arguments, so that the argument can be distinguished, but it can also be used when there are none.
library(gsubfn)
fun_list2 <- list(
X2 = ~ quantile(var, c(0.1, 0.9), na.rm = TRUE),
X4 = x ~ fivenum(x, na.rm = TRUE)
)
execute <- function(funs, envir = parent.frame()) {
  # use names(funs), not the global fun_list, so fun_list2 is indexed by its own names
  Map(\(fo, index) match.funfn(fo)(envir[[index]]), funs, names(funs))
}
# test
execute(fun_list, df) |> identical(expected)
## [1] TRUE
execute(fun_list2, df) |> identical(expected)
## [1] TRUE
Input from question:
set.seed(123)
df <- data.frame(matrix(rnorm(500), 100, 5))
fun_list <- list(
X2 = ~ quantile(.x, c(0.1, 0.9), na.rm = TRUE),
X4 = ~ fivenum(.x, na.rm = TRUE)
)
Upvotes: 3
Reputation: 35554
A workaround is to use a nested map, which can take a purrr-style formula as input and avoids the use of rlang::as_function().
library(purrr)
imap(fun_list, \(f, var) map(df[var], f)[[1]])
# $X2
# 10% 90%
# -1.289408 1.058432
#
# $X4
# [1] -2.465898194 -0.737146704 -0.003508661 0.693634712 2.571458146
or briefly, imap(fun_list, ~ map(df[.y], .x)[[1]]).
Upvotes: 2