Reputation: 51
I am trying to apply case_when to selected columns in dataframe_1 based on a list of case_when functions in another dataframe.
For example, using the mtcars dataframe:
mpg | cyl |
---|---|
21 | 6 |
21 | 6 |
22.8 | 4 |
I would like to use the below dataframe (dataframe_2) to mutate the columns.
dataframe_2 = data.frame(variable = c(c("mpg", "cyl")), function_to_use = c('case_when(mpg == 21 ~ "A", mpg == 22.8 ~ "B"', 'case_when(cyl == 6 ~ "X", cyl == 4 ~ "Y"'))
variable | function_to_use |
---|---|
mpg | case_when(mpg == 21 ~ "A", mpg == 22.8 ~ "B" |
cyl | case_when(cyl == 6 ~ "X", cyl == 4 ~ "Y" |
The expected outcome would be:
mpg | cyl |
---|---|
A | X |
A | X |
B | Y |
I would like to accomplish this dynamically using dataframe_2.
I have tried the below code.
mtcars %>% mutate(across(dataframe_2$variable ~ !!!parse_exprs(dataframe_2$function_to_use)))
I have noticed that the below code works, but does not change the original columns.
a <- ‘case_when(mpg == 21 ~ "A", mpg == 22.8 ~ "B"; case_when(cyl == 6 ~ "X", cyl == 4 ~ "Y"‘
mtcars %>% mutate(across(c("mpg", "cyl") ~ !!!parse_exprs(a)))
Edit: extension to the problem.
My dataframes are actually dataframes in a list:
dataframe_1 <- list(dataframe_1a = mtcars[1:3, c("mpg", "cyl")], dataframe_1b = mtcars[4:6, c("mpg", "cyl")])
dataframe_2 <- list(dataframe_2a = data.frame(variable = c(c("mpg", "cyl")), function_to_use = c('case_when(mpg == 21 ~ "A", mpg == 22.8 ~ "B"',
'case_when(cyl == 6 ~ "X", cyl == 4 ~ "Y"')),
dataframe_2b = data.frame(variable = c(c("mpg", "cyl")), function_to_use = c('case_when(mpg > 21 ~ "A", TRUE ~ "B"',
'case_when(cyl >7 ~ "X", TRUE ~ 4 ~ "Y"'))
How would I apply each function to their respective dataframe pairs? I've tried using map2 but with no success. Thank you for your help!
Upvotes: 1
Views: 95
Reputation: 123768
Using purrr::reduce2
you could loop over the rows of dataframe_2
like so:
Note: Your case_when
statements missed the closing parenthesis.
library(purrr)
library(dplyr, warn = FALSE)
dataframe_1 <- mtcars[1:3, c("mpg", "cyl")]
dataframe_2 <- data.frame(
variable = c(c("mpg", "cyl")),
function_to_use = c(
'case_when(mpg == 21 ~ "A", mpg == 22.8 ~ "B")',
'case_when(cyl == 6 ~ "X", cyl == 4 ~ "Y")'
)
)
dataframe_1 |>
purrr::reduce2(
dataframe_2$variable, dataframe_2$function_to_use,
\(x, y, z) mutate(x, across(all_of(y), ~ !!!rlang::parse_exprs(z))),
.init = _
)
#> mpg cyl
#> Mazda RX4 A X
#> Mazda RX4 Wag A X
#> Datsun 710 B Y
EDIT For your more general problem you could wrap the code in a function, then use purrr::map2
to loop over the lists of data frames:
library(purrr)
library(dplyr, warn = FALSE)
library(rlang)
myfun <- function(x, y) {
x |>
purrr::reduce2(
y$variable, y$function_to_use,
\(x, y, z) mutate(x, across(all_of(y), ~ !!!rlang::parse_exprs(z))),
.init = _
)
}
purrr::map2(dataframe_1, dataframe_2, myfun)
#> $dataframe_1a
#> mpg cyl
#> Mazda RX4 A X
#> Mazda RX4 Wag A X
#> Datsun 710 B Y
#>
#> $dataframe_1b
#> mpg cyl
#> Hornet 4 Drive A Y
#> Hornet Sportabout B X
#> Valiant B Y
Upvotes: 1
Reputation: 5137
I offer a purely dplyr solution ; first I change the function_to_use statements to assume some suitably anonymous input as the precise input is coming from the associated variable column in the frame; nothing is lost, but the function is suitably easy to satisfy.
library(dplyr, warn = FALSE)
dataframe_1 <- mtcars[1:3, c("mpg", "cyl")]
dataframe_2 <- data.frame(
variable = c(c("mpg", "cyl")),
function_to_use = c(
'case_when(x == 21 ~ "A", x == 22.8 ~ "B")',
'case_when(x == 6 ~ "X", x == 4 ~ "Y")'
)
)
my_eval_func <- \(x){
ftu <- filter(
dataframe_2,
variable == cur_column()
) |> pull(function_to_use)
eval(parse(text = ftu))
}
mutate(dataframe_1, across(c("mpg", "cyl"),
.fns = my_eval_func))
# mpg cyl
# Mazda RX4 A X
# Mazda RX4 Wag A X
# Datsun 710 B Y
Upvotes: 0