EJJ

Reputation: 1523

Using case_when with multiple vectors

I am trying to use case_when to modify a column based on two separate input vectors: one supplies the values compared on the LHS of each condition, and the other supplies the corresponding replacement value on the RHS. An example is provided below.

library(dplyr)
library(purrr)
library(tibble)

df <- tibble(var = paste0(rep("var", 10), 1:10),
             label = c("label1", "label2", rep(NA, 7), "label10"))

match_var <- paste0(rep("var", 7), 3:9)
new_labels <- paste0(rep("add_this_label", 7), 3:9)

df %>% 
  mutate(test = map2(match_var , new_labels,
                     ~case_when(
                       var == .x ~ .y,
                       TRUE ~ label
                     )
  ))

I think the issue is that everything inside case_when is evaluated as an expression, but I'm not completely sure. One could type out all 7 lines inside case_when by hand, as shown below, but in my application the vectors match_var and new_labels are very long, which makes typing them out infeasible.

df %>% 
  mutate(label = case_when(
    var == match_var[1] ~ new_labels[1],
    var == match_var[2] ~ new_labels[2],
    var == match_var[3] ~ new_labels[3],
    var == match_var[4] ~ new_labels[4],
    var == match_var[5] ~ new_labels[5],
    var == match_var[6] ~ new_labels[6],
    var == match_var[7] ~ new_labels[7],
    TRUE ~ label
  ))

EDIT: the desired result can be accomplished using a for loop, but I would still like to know whether this is possible using case_when and a map2_* function.

for (i in seq_along(match_var)) {
  df$label <- ifelse(df$var == match_var[i], new_labels[i], df$label)
}

Upvotes: 5

Views: 4042

Answers (3)

Allen Baron

Reputation: 153

Since every condition is a simple == comparison, this can also be done with dplyr::recode and a named vector (note that the vector must be unquote-spliced with !!!):

df %>%
    mutate(
        label = recode(
            var,
            !!!setNames(new_labels, match_var),
            .default = label
        )
    )
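The same !!! splicing can also be used to answer the original question about combining map2 with case_when. The following is only a sketch, assuming rlang's expr() is used to build one unevaluated two-sided formula per pair; the names cases and res_case_when are made up for illustration and do not come from the question.

```r
library(dplyr)
library(purrr)
library(rlang)
library(tibble)

df <- tibble(var = paste0("var", 1:10),
             label = c("label1", "label2", rep(NA, 7), "label10"))
match_var  <- paste0("var", 3:9)
new_labels <- paste0("add_this_label", 3:9)

# Build one unevaluated `var == "<match>" ~ "<label>"` expression per pair.
# `cases` is a hypothetical helper name, not part of the original post.
cases <- map2(match_var, new_labels,
              function(m, l) expr(var == !!m ~ !!l))

# Splice the generated expressions into case_when; rows that match
# none of them keep their existing label.
res_case_when <- df %>%
  mutate(label = case_when(!!!cases, TRUE ~ label))

res_case_when
```

Because the conditions are generated from the two vectors, this should scale to long match_var and new_labels without typing each line by hand.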

Upvotes: 1

dipetkov

Reputation: 3700

You can join the new labels to the data frame and fill in with the old label as necessary.

library("tidyverse")

df <- tibble(var = paste0(rep("var", 10), 1:10),
             label = c("label1", "label2", rep(NA, 7), "label10"))

match_var <- paste0(rep("var", 7), 3:9)
new_label <- paste0(rep("add_this_label", 7), 3:9)

new_labels <-  tibble(match_var, new_label)

df %>%
  left_join(new_labels, by = c("var" = "match_var")) %>%
  mutate(new_label = if_else(is.na(new_label), label, new_label))
#> # A tibble: 10 x 3
#>    var   label   new_label      
#>    <chr> <chr>   <chr>          
#>  1 var1  label1  label1         
#>  2 var2  label2  label2         
#>  3 var3  <NA>    add_this_label3
#>  4 var4  <NA>    add_this_label4
#>  5 var5  <NA>    add_this_label5
#>  6 var6  <NA>    add_this_label6
#>  7 var7  <NA>    add_this_label7
#>  8 var8  <NA>    add_this_label8
#>  9 var9  <NA>    add_this_label9
#> 10 var10 label10 label10

Created on 2019-03-28 by the reprex package (v0.2.1)
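If the goal is to overwrite label in place rather than keep a separate new_label column, the join can be followed by coalesce. This is a possible variant, not part of the answer above; lookup_tbl and res_join are hypothetical names.

```r
library(dplyr)
library(tibble)

df <- tibble(var = paste0("var", 1:10),
             label = c("label1", "label2", rep(NA, 7), "label10"))

# Lookup table pairing each variable with its new label
# (`lookup_tbl` is a made-up name for this sketch).
lookup_tbl <- tibble(match_var = paste0("var", 3:9),
                     new_label = paste0("add_this_label", 3:9))

res_join <- df %>%
  left_join(lookup_tbl, by = c("var" = "match_var")) %>%
  # Prefer the joined label; fall back to the old one where no match exists.
  mutate(label = coalesce(new_label, label)) %>%
  select(-new_label)

res_join
```

Here coalesce takes the first non-missing value, so a matched row gets the new label even if it already had one, which mirrors the case_when in the question.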

Upvotes: 1

akrun

Reputation: 887951

We create a named vector and use it to match the values in 'var', so that the NA elements of 'label' are replaced with the corresponding 'new_labels'.

library(tibble)
library(dplyr)
df %>%
    mutate(label = case_when(
        is.na(label) ~ deframe(tibble(match_var, new_labels))[var],
        TRUE ~ label
    ))
# A tibble: 10 x 2
#   var   label          
#   <chr> <chr>          
# 1 var1  label1         
# 2 var2  label2         
# 3 var3  add_this_label3
# 4 var4  add_this_label4
# 5 var5  add_this_label5
# 6 var6  add_this_label6
# 7 var7  add_this_label7
# 8 var8  add_this_label8
# 9 var9  add_this_label9
#10 var10 label10        

NOTE: Instead of deframe, the named vector can also be created with setNames.
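A minimal sketch of that setNames variant, assuming the same df, match_var and new_labels as in the question (lookup and res_lookup are illustrative names only):

```r
library(dplyr)
library(tibble)

df <- tibble(var = paste0("var", 1:10),
             label = c("label1", "label2", rep(NA, 7), "label10"))
match_var  <- paste0("var", 3:9)
new_labels <- paste0("add_this_label", 3:9)

# Named vector: names are the values to match, elements are the new labels.
lookup <- setNames(new_labels, match_var)

res_lookup <- df %>%
  mutate(label = case_when(
    # Index the named vector by `var`; unname() drops the carried-over names.
    is.na(label) ~ unname(lookup[var]),
    TRUE ~ label
  ))

res_lookup
```

Note that this only fills in rows whose label is NA, so a row that already has a label keeps it even if its var appears in match_var.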

Upvotes: 2
