Reputation: 97
My problem seemed really easy, but I can't figure out an easy solution. All categorical variables in my dataset share the value "missing". In order to join results later on with a function of my own, I need this value to be unique per variable, so what I want is to change the value "missing" to "missing (var_name)".
I first tried something like:
data %>% mutate(across(where(is.character),
~ replace(., . == "missing", paste("missing", SOMETHING(.)))))
This doesn't quite work because I am missing this SOMETHING function, which would access the column name inside the across statement using only the "." parameter...
The other solution I tried is using
purrr::imap(data %>% select(where(is.character)),
            ~ replace(.x, .x == "missing", paste("missing", .y)))
This is close to what I want, but then I have trouble reinserting the purrr::imap output into my initial dataframe, easily and computationally efficiently, in place of the initial character columns.
I think I need a break and/or some help to see more clearly, because I am kind of tired of fighting with something which appears to be so simple...
I would rather use the dplyr solution, but the purrr one is OK. Actually, whatever works fine and quickly (just so you know, I have more than 600 columns and 150,000 rows).
Any help or advice is welcome!
Thanks
Upvotes: 2
Views: 381
Reputation: 35594
Example Data
df <- data.frame(var.X = c("a", "missing", "a"),
                 var.Y = c("b", "b", "missing"),
                 var.Z = c("missing", "missing", "c"))
#     var.X   var.Y   var.Z
# 1       a       b missing
# 2 missing       b missing
# 3       a missing       c
With dplyr, you can use cur_column() in across(). From ?context:
cur_column() gives the name of the current column (in across() only).
library(dplyr)
df %>%
  mutate(across(where(is.character),
                ~ recode(.x, missing = paste0("missing(", cur_column(), ")"))))
#            var.X          var.Y          var.Z
# 1              a              b missing(var.Z)
# 2 missing(var.X)              b missing(var.Z)
# 3              a missing(var.Y)              c
or
df %>%
  mutate(across(where(is.character),
                ~ recode(.x, missing = sprintf("missing(%s)", cur_column()))))
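For the purrr route mentioned in the question, one possible way to put the imap() output back into the data frame is to assign it to the same set of columns. This is only a sketch on the example data, not benchmarked on 600 columns; chr_cols is a helper name introduced here for illustration, and stringsAsFactors = FALSE is added on the assumption that the columns should be character even on R versions before 4.0.

```r
library(purrr)

# Same example data as above; stringsAsFactors = FALSE is an added assumption
df <- data.frame(var.X = c("a", "missing", "a"),
                 var.Y = c("b", "b", "missing"),
                 var.Z = c("missing", "missing", "c"),
                 stringsAsFactors = FALSE)

# Names of the character columns (chr_cols is a hypothetical helper name)
chr_cols <- names(df)[vapply(df, is.character, logical(1))]

# imap() passes each column as .x and its name as .y.
# Assigning the resulting named list to df[chr_cols] overwrites those columns in place.
# %in% is used instead of == so that NA values are left untouched.
df[chr_cols] <- imap(df[chr_cols],
                     ~ replace(.x, .x %in% "missing", paste0("missing(", .y, ")")))
```

After this, df holds the same values as the dplyr result above, and any non-character columns are left as they were.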
Upvotes: 2