Fill a new column based on whether two other cols are filled or empty

Question

Here are some example data to illustrate my problem

dput(example)
structure(list(ID1 = c("", "abj6498", "", "", "", "", "", "", 
"avl5803", "avl3030", "vz30775", "", "", "s883612", "avl7884", 
"", "", "avl7905", "", "", "", "", "ty68281", "", "", "z101246", 
"", "", "", "", "afh2697", "", "", "", "", "avl3037", "", "", 
"", "tw77193", "", "", "tw77053", "avl6702", "", "abj6256", "", 
"", "", ""), ID2 = c("", "and7457", "", "", "", "", "", "", "", 
"j215736", "vz33290", "", "", "afh4845", "", "", "", "j215809", 
"", "", "", "", "ty69088", "", "", "j215813", "", "", "avl6192", 
"", "j215807", "", "avl5585", "", "", "avl5441", "", "", "", 
"ty70093", "", "", "tr43355", "avl3013", "", "afh3143", "", "", 
"", "")), row.names = c(NA, -50L), class = "data.frame")

I have the dataframe example which has two cols, ID1 and ID2 all I need is a new column present based on whether or not either or both of ID1 and ID2 contain values

e.g. If ID1 is full but ID2 is not then the new column present should have the value 1 and vice versa, if both contain a value then present would contain 2

I have tried various methods using tidyverse::mutate and if statements but can't quite get the logic right.

TarJae · Accepted Answer

We could replace empty with NA and check with an case_when statement:

library(dplyr)
df %>%
  mutate(across(everything(), ~ifelse(.=="", NA, as.character(.)))) %>% 
  mutate(present = case_when(!is.na(ID1) & !is.na(ID2) ~ 2,
                             !is.na(ID1) | !is.na(ID2) ~ 1))

output:

   ID1     ID2     present
           
 1 NA      NA           NA
 2 abj6498 and7457       2
 3 NA      NA           NA
 4 NA      NA           NA
 5 NA      NA           NA
 6 NA      NA           NA
 7 NA      NA           NA
 8 NA      NA           NA
 9 avl5803 NA            1
10 avl3030 j215736       2
# … with 40 more rows

Fill a new column based on whether two other cols are filled or empty

Answers (2)

Related Questions