Referring to column names inside dplyr's across()

Question

Is it possible to refer to column names in a lambda function inside across()?

df <- tibble(age = c(12, 45), sex = c('f', 'f'))
allowed_values <- list(age = 18:100, sex = c("f", "m"))

df %>%
  mutate(across(c(age, sex),
                c(valid = ~ .x %in% allowed_values[[COLNAME]])))

I just came across this question where OP asks about validating columns in a dataframe based on a list of allowed values.

dplyr just gained across(), which seems like a natural choice, but we need the column names to look up the allowed values.

The best I could come up with was a call to imap_dfr, but that is more cumbersome to integrate into an analysis pipeline because the results need to be recombined with the original dataframe.

s_pike · Accepted Answer

Yes, you can refer to column names inside dplyr's across() by calling cur_column(), which returns the name of the column currently being processed. Your original attempt was very close. Insert cur_column() into your solution where you want the column name:

library(tidyverse)

df <- tibble(age = c(12, 45), sex = c('f', 'f'))
allowed_values <- list(age = 18:100, sex = c("f", "m"))

df %>%
  mutate(across(c(age, sex),
                c(valid = ~ .x %in% allowed_values[[cur_column()]])))

Reference: https://dplyr.tidyverse.org/articles/colwise.html#current-column
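As a variation, here is a minimal sketch of the same idea that selects the columns from the names of allowed_values and sets the output names explicitly with the .names argument of across(). The result name checked and the "{.col}_valid" naming pattern are my own choices for illustration, not part of the original question, and the sketch assumes dplyr 1.0.0 or later:

```r
library(dplyr)

df <- tibble(age = c(12, 45), sex = c("f", "f"))
allowed_values <- list(age = 18:100, sex = c("f", "m"))

# Validate every column that has an entry in allowed_values.
# cur_column() gives the name of the column across() is currently processing,
# which is used here to look up the matching vector of allowed values.
checked <- df %>%
  mutate(across(all_of(names(allowed_values)),
                ~ .x %in% allowed_values[[cur_column()]],
                .names = "{.col}_valid"))  # illustrative naming pattern

checked
# age_valid is FALSE, TRUE (12 is outside 18:100); sex_valid is TRUE, TRUE
```

Driving the selection from names(allowed_values) keeps the column list and the lookup list in sync, so adding a new rule only means adding one entry to allowed_values.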