Reputation: 1523
I am trying to use case_when to modify/mutate a column based on two separate inputs: one that is used to create the logical condition on the LHS, and one that supplies the corresponding replacement value on the RHS. An example is provided below.
library(dplyr)
library(purrr)
library(tibble)
df <- tibble(var = paste0(rep("var", 10), 1:10),
             label = c("label1", "label2", rep(NA, 7), "label10"))
match_var <- paste0(rep("var", 7), 3:9)
new_labels <- paste0(rep("add_this_label", 7), 3:9)
df %>%
  mutate(test = map2(match_var, new_labels,
                     ~case_when(
                       var == .x ~ .y,
                       TRUE ~ label
                     )
  ))
I think the issue is that everything within case_when is evaluated as an expression, but I'm not completely sure. One can manually type out all 7 lines within case_when, but my application requires me to accomplish this when the vectors match_var and new_labels are very long, which makes typing out the case_when branches by hand infeasible.
df %>%
  mutate(label = case_when(
    var == match_var[1] ~ new_labels[1],
    var == match_var[2] ~ new_labels[2],
    var == match_var[3] ~ new_labels[3],
    var == match_var[4] ~ new_labels[4],
    var == match_var[5] ~ new_labels[5],
    var == match_var[6] ~ new_labels[6],
    var == match_var[7] ~ new_labels[7],
    TRUE ~ label
  ))
EDIT: the desired result can be accomplished using a for loop, but now I want to know whether this is possible using case_when and a map2_* function.
for (i in seq_along(match_var)) {
  df$label <- ifelse(df$var == match_var[i], new_labels[i], df$label)
}
Upvotes: 5
Views: 4042
Reputation: 153
Since every condition is a plain == comparison, this can also be done with dplyr::recode using a named vector (note the need for unquote splicing with !!!):
df %>%
  mutate(
    label = recode(
      var,
      !!!setNames(new_labels, match_var),
      .default = label
    )
  )
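The same !!! splicing can, as far as I can tell, also answer the original case_when/map2 question directly. The following is only a sketch under that assumption: map2 builds one unevaluated `condition ~ value` branch per pair, and mutate splices them into the case_when call. The names `branches` and `result` are hypothetical, and rlang::expr is used here as one way to build the branches, not necessarily the only one.

```r
library(dplyr)
library(purrr)
library(tibble)

df <- tibble(var = paste0("var", 1:10),
             label = c("label1", "label2", rep(NA, 7), "label10"))
match_var <- paste0("var", 3:9)
new_labels <- paste0("add_this_label", 3:9)

# Build one unevaluated `condition ~ value` branch per pair of inputs.
# `!!` inlines the current string values into each expression.
branches <- map2(match_var, new_labels,
                 function(m, l) rlang::expr(var == !!m ~ !!l))

# `!!!` splices the branches into the call when mutate() captures it,
# so case_when() sees them as if they had been typed out by hand.
result <- df %>%
  mutate(label = case_when(!!!branches, TRUE ~ label))
```

Because the branches are generated from the two vectors, this scales to long match_var and new_labels without any manual typing.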
Upvotes: 1
Reputation: 3700
You can join the new labels to the data frame and fill in with the old label as necessary.
library("tidyverse")
df <- tibble(var = paste0(rep("var", 10), 1:10),
             label = c("label1", "label2", rep(NA, 7), "label10"))
match_var <- paste0(rep("var", 7), 3:9)
new_label <- paste0(rep("add_this_label", 7), 3:9)
new_labels <- tibble(match_var, new_label)
df %>%
  left_join(new_labels, by = c("var" = "match_var")) %>%
  mutate(new_label = if_else(is.na(new_label), label, new_label))
#> # A tibble: 10 x 3
#>    var   label   new_label
#>    <chr> <chr>   <chr>
#>  1 var1  label1  label1
#>  2 var2  label2  label2
#>  3 var3  <NA>    add_this_label3
#>  4 var4  <NA>    add_this_label4
#>  5 var5  <NA>    add_this_label5
#>  6 var6  <NA>    add_this_label6
#>  7 var7  <NA>    add_this_label7
#>  8 var8  <NA>    add_this_label8
#>  9 var9  <NA>    add_this_label9
#> 10 var10 label10 label10
Created on 2019-03-28 by the reprex package (v0.2.1)
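As a possible variation on the join above, dplyr::coalesce can stand in for the if_else/is.na pair, and the helper column can be dropped so that label is overwritten in place. This is a sketch, not the answer's original code; the names `lookup` and `result` are hypothetical.

```r
library(dplyr)
library(tibble)

df <- tibble(var = paste0("var", 1:10),
             label = c("label1", "label2", rep(NA, 7), "label10"))

# Lookup table pairing each variable name with its new label.
lookup <- tibble(match_var = paste0("var", 3:9),
                 new_label = paste0("add_this_label", 3:9))

result <- df %>%
  left_join(lookup, by = c("var" = "match_var")) %>%
  # coalesce() takes the first non-missing value per row:
  # the joined new_label if there is one, otherwise the old label.
  mutate(label = coalesce(new_label, label)) %>%
  select(-new_label)
```

Putting new_label first means a matched row always takes the new label, which mirrors the if_else logic in the answer.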
Upvotes: 1
Reputation: 887951
We create a named vector and use it to look up the values in 'var', so that the NA elements of 'label' are replaced by the corresponding 'new_labels'.
library(tibble)
library(dplyr)
df %>%
  mutate(label = case_when(is.na(label) ~
                             deframe(tibble(match_var, new_labels))[var],
                           TRUE ~ label))
# A tibble: 10 x 2
#   var   label
#   <chr> <chr>
# 1 var1  label1
# 2 var2  label2
# 3 var3  add_this_label3
# 4 var4  add_this_label4
# 5 var5  add_this_label5
# 6 var6  add_this_label6
# 7 var7  add_this_label7
# 8 var8  add_this_label8
# 9 var9  add_this_label9
#10 var10 label10
NOTE: Instead of using deframe, the named vector can also be created with setNames.
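A minimal sketch of that setNames variant, assuming the same df, match_var and new_labels as in the question (the names `lookup` and `result` are hypothetical, and unname() is added here only to keep the looked-up values free of names):

```r
library(dplyr)
library(tibble)

df <- tibble(var = paste0("var", 1:10),
             label = c("label1", "label2", rep(NA, 7), "label10"))
match_var <- paste0("var", 3:9)
new_labels <- paste0("add_this_label", 3:9)

# Named vector: names are the values to match, elements are the new labels.
lookup <- setNames(new_labels, match_var)

result <- df %>%
  # Indexing by name returns the new label for each var (NA if no match);
  # only rows whose label is currently NA are replaced.
  mutate(label = case_when(is.na(label) ~ unname(lookup[var]),
                           TRUE ~ label))
```

Note that, like the deframe version, this only fills in missing labels; a row with an existing label keeps it even if its var appears in match_var.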
Upvotes: 2