Reputation: 25
I am trying to create a new variable based on other variables in my dataset. I have used ifelse to do similar tasks, however when I have tried it this time I am getting NA when the condition is not met.
The variables I want to base the new variable on are all binary - "Yes" or "No". I want the new variable to be coded "Yes" if any of the other variables are "Yes" and "No" if none of them are coded "Yes". When I run the ifelse, I get the expected number of "Yes", but the rows I would expect to be "No" are NA.
I have tried the following:
data$new <- ifelse(var1=="Yes" | var2=="Yes" | var3=="Yes","Yes","No")
Any help would be greatly appreciated. I have changed the names of the data and put three variables in the example to keep it simple; there are actually 22 variables in total with very similar names. If seeing the actual data / code would be helpful, I'll add it.
Thanks!
Upvotes: 1
Views: 554
Reputation: 36
Try complete.cases() to exclude rows with NAs while creating the variable:
df$new <- ifelse((var1 =="Yes"| var2=="Yes"|var3=="Yes")& complete.cases(df), "Yes", "No")
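As a quick check of how this behaves, here is a minimal sketch on made-up data (the data frame df and its values are assumptions for illustration, not the asker's data). It suggests that the NAs disappear, but a row containing both a "Yes" and an NA ends up as "No", which may or may not be what is wanted.

```r
# Hypothetical toy data; column names follow the question, values are invented
df <- data.frame(
  var1 = c("Yes", "No", "Yes", NA),
  var2 = c("No",  "No", NA,    "No"),
  var3 = c("No",  "No", "No",  "No"),
  stringsAsFactors = FALSE
)

# TRUE if any column is "Yes"; may be NA when a row contains NA and no "Yes"
any_yes <- with(df, var1 == "Yes" | var2 == "Yes" | var3 == "Yes")

# complete.cases() is FALSE for any row with at least one NA,
# so every incomplete row is forced to "No", including row 3
res <- ifelse(any_yes & complete.cases(df), "Yes", "No")
print(res)
```

Row 3 has var1 equal to "Yes" but is still coded "No" because var2 is NA in that row.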
Upvotes: 0
Reputation: 887088
The == comparison returns NA wherever an element is NA, and combining NA with FALSE using | also gives NA, which ifelse then passes through. An option would be to cbind the variables 'var1', 'var2', 'var3' (it is not clear whether they are data.frame columns or independent vectors), compare with "Yes" to create a logical matrix, and use rowSums to count the "Yes" strings in each row. Note the na.rm = TRUE, which takes care of the NA elements (if any). If the row sum is greater than 0, the result is "Yes"; otherwise it is "No".
ifelse(rowSums(cbind(var1, var2, var3) == "Yes", na.rm = TRUE) > 0, "Yes", "No")
To check why the original code produces NA:
v1 <- c("Yes", "No", NA)
v2 <- c("No", NA, "Yes")
(v1 == "Yes")|(v2 == "Yes")
#[1] TRUE NA TRUE
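Since the question mentions 22 similarly named columns, the rowSums idea could be applied to a data frame by selecting the columns by name pattern. This is only a sketch under assumptions: the data frame df, the id column, the "^var" prefix, and the values are all hypothetical.

```r
# Hypothetical data; names and values are invented for illustration
df <- data.frame(
  id   = 1:4,
  var1 = c("Yes", "No", NA,    "No"),
  var2 = c("No",  NA,   "Yes", "No"),
  var3 = c("No",  "No", NA,    "No"),
  stringsAsFactors = FALSE
)

# Pick every column whose name starts with "var" (assumed naming scheme)
cols <- grep("^var", names(df), value = TRUE)

# Comparing a data frame with a string gives a logical matrix;
# na.rm = TRUE makes rowSums ignore NA cells instead of returning NA
any_yes <- rowSums(df[cols] == "Yes", na.rm = TRUE) > 0

df$new <- ifelse(any_yes, "Yes", "No")
print(df)
```

With this approach, a row that contains only "No" and NA values (or only NA) is coded "No"; if such rows should stay NA, that case would need separate handling.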
Upvotes: 5