Jo Lo
Jo Lo

Reputation: 51

Applying case_when to dataframe based on list of functions in another dataframe

I am trying to apply case_when to selected columns in dataframe_1 based on a list of case_when functions in another dataframe.

For example, using the mtcars dataframe:

mpg cyl
21 6
21 6
22.8 4

I would like to use the below dataframe (dataframe_2) to mutate the columns.

dataframe_2 = data.frame(variable = c(c("mpg", "cyl")), function_to_use = c('case_when(mpg == 21 ~ "A", mpg == 22.8 ~ "B"', 'case_when(cyl == 6 ~ "X", cyl == 4 ~ "Y"'))

variable function_to_use
mpg case_when(mpg == 21 ~ "A", mpg == 22.8 ~ "B"
cyl case_when(cyl == 6 ~ "X", cyl == 4 ~ "Y"

The expected outcome would be:

mpg cyl
A X
A X
B Y

I would like to accomplish this dynamically using dataframe_2.

I have tried the below code.

mtcars %>% mutate(across(dataframe_2$variable ~ !!!parse_exprs(dataframe_2$function_to_use)))

I have noticed that the below code works, but does not change the original columns.

a <- ‘case_when(mpg == 21 ~ "A", mpg == 22.8 ~ "B"; case_when(cyl == 6 ~ "X", cyl == 4 ~ "Y"‘
mtcars %>% mutate(across(c("mpg", "cyl") ~ !!!parse_exprs(a)))

Edit: extension to the problem.

My dataframes are actually dataframes in a list:

dataframe_1 <- list(dataframe_1a = mtcars[1:3, c("mpg", "cyl")], dataframe_1b = mtcars[4:6, c("mpg", "cyl")])
dataframe_2 <- list(dataframe_2a = data.frame(variable = c(c("mpg", "cyl")), function_to_use = c('case_when(mpg == 21 ~ "A", mpg == 22.8 ~ "B"', 
                                             'case_when(cyl == 6 ~ "X", cyl == 4 ~ "Y"')), 
dataframe_2b = data.frame(variable = c(c("mpg", "cyl")), function_to_use = c('case_when(mpg > 21 ~ "A", TRUE ~ "B"',
'case_when(cyl >7 ~ "X", TRUE ~ 4 ~ "Y"'))

How would I apply each function to their respective dataframe pairs? I've tried using map2 but with no success. Thank you for your help!

Upvotes: 1

Views: 95

Answers (2)

stefan
stefan

Reputation: 123768

Using purrr::reduce2 you could loop over the rows of dataframe_2 like so:

Note: Your case_when statements missed the closing parenthesis.

library(purrr)
library(dplyr, warn = FALSE)

dataframe_1 <- mtcars[1:3, c("mpg", "cyl")]

dataframe_2 <- data.frame(
  variable = c(c("mpg", "cyl")),
  function_to_use = c(
    'case_when(mpg == 21 ~ "A", mpg == 22.8 ~ "B")',
    'case_when(cyl == 6 ~ "X", cyl == 4 ~ "Y")'
  )
)

dataframe_1 |>
  purrr::reduce2(
    dataframe_2$variable, dataframe_2$function_to_use,
    \(x, y, z) mutate(x, across(all_of(y), ~ !!!rlang::parse_exprs(z))),
    .init = _
  )
#>               mpg cyl
#> Mazda RX4       A   X
#> Mazda RX4 Wag   A   X
#> Datsun 710      B   Y

EDIT For your more general problem you could wrap the code in a function, then use purrr::map2 to loop over the lists of data frames:

library(purrr)
library(dplyr, warn = FALSE)
library(rlang)

myfun <- function(x, y) {
  x |>
    purrr::reduce2(
      y$variable, y$function_to_use,
      \(x, y, z) mutate(x, across(all_of(y), ~ !!!rlang::parse_exprs(z))),
      .init = _
    )
}

purrr::map2(dataframe_1, dataframe_2, myfun)
#> $dataframe_1a
#>               mpg cyl
#> Mazda RX4       A   X
#> Mazda RX4 Wag   A   X
#> Datsun 710      B   Y
#> 
#> $dataframe_1b
#>                   mpg cyl
#> Hornet 4 Drive      A   Y
#> Hornet Sportabout   B   X
#> Valiant             B   Y

Upvotes: 1

Nir Graham
Nir Graham

Reputation: 5137

I offer a purely dplyr solution ; first I change the function_to_use statements to assume some suitably anonymous input as the precise input is coming from the associated variable column in the frame; nothing is lost, but the function is suitably easy to satisfy.

library(dplyr, warn = FALSE)

dataframe_1 <- mtcars[1:3, c("mpg", "cyl")]

dataframe_2 <- data.frame(
  variable = c(c("mpg", "cyl")),
  function_to_use = c(
    'case_when(x == 21 ~ "A", x == 22.8 ~ "B")',
    'case_when(x == 6 ~ "X", x == 4 ~ "Y")'
  )
)

my_eval_func <- \(x){
  ftu <- filter(
    dataframe_2,
    variable == cur_column()
  ) |> pull(function_to_use)
  eval(parse(text = ftu))
}

mutate(dataframe_1, across(c("mpg", "cyl"), 
                            .fns = my_eval_func))

#              mpg cyl
# Mazda RX4       A   X
# Mazda RX4 Wag   A   X
# Datsun 710      B   Y

Upvotes: 0

Related Questions