chan1612

Reputation: 25

Why is this ifelse statement producing NAs?

I am trying to create a new variable based on other variables in my dataset. I have used ifelse for similar tasks before, but this time I am getting NA when the condition is not met.

The variables I want to base the new variable on are all binary: "Yes" or "No". I want the new variable to be coded "Yes" if any of the other variables are "Yes", and "No" if none of them are. When I run the ifelse, I get the expected number of "Yes" values, but the values I would expect to be "No" are NA.

I have tried the following:

data$new <- ifelse(var1=="Yes" | var2=="Yes" | var3=="Yes","Yes","No")

Any help would be greatly appreciated. I have changed the names of the data and put three variables in the example to simplify it; there are actually 22 variables in total, with very similar names. If seeing the actual data or code would be helpful, I'll add it.

Thanks!

Upvotes: 1

Views: 554

Answers (2)

Gsingh

Reputation: 36

Try complete.cases() to exclude rows containing NAs while creating the variable:

df$new <- ifelse((var1 =="Yes"| var2=="Yes"|var3=="Yes")& complete.cases(df), "Yes", "No")


Upvotes: 0

akrun

Reputation: 887088

== returns NA wherever an element is NA, and NA | FALSE is also NA, so ifelse returns NA for those rows. One option is to cbind the variables 'var1', 'var2', 'var3' (it is not clear whether they are data.frame columns or independent vectors), compare the result with "Yes" to create a logical matrix, and use rowSums to count the "Yes" values in each row. Note the na.rm = TRUE, which takes care of any NA elements. If the row sum is greater than 0, the result is "Yes"; otherwise it is "No".

ifelse(rowSums(cbind(var1, var2, var3) == "Yes", na.rm = TRUE) > 0, "Yes", "No")
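
If the variables are columns of a data frame rather than free-standing vectors, the same idea can be applied to a subset of columns, which may be convenient with many similarly named variables. The sketch below assumes a hypothetical data frame df whose relevant columns all start with "var"; the data, the column names, and the grep pattern are illustrative and not taken from the question.

```r
# Hypothetical example data; df and var1..var3 are assumed names
df <- data.frame(
  var1 = c("Yes", "No", NA, "No"),
  var2 = c("No", NA, "Yes", "No"),
  var3 = c("No", "No", NA, NA),
  stringsAsFactors = FALSE
)

# Select the columns to test by name pattern (adjust to the real names)
cols <- grep("^var", names(df), value = TRUE)

# Comparing a data frame with a string gives a logical matrix;
# na.rm = TRUE makes rowSums skip the NA comparisons
df$new <- ifelse(rowSums(df[cols] == "Yes", na.rm = TRUE) > 0, "Yes", "No")
```

Note that a row in which every value is NA is coded "No" by this approach; whether such rows should stay NA instead depends on the analysis.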

To see why the original code produces NA:

v1 <- c("Yes", "No", NA)
v2 <- c("No", NA, "Yes")

(v1 == "Yes")|(v2 == "Yes")
#[1] TRUE   NA TRUE
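
For comparison, here is a small sketch of the rowSums approach on the same two vectors (redefined so the snippet runs on its own). The name res is only used for illustration.

```r
v1 <- c("Yes", "No", NA)
v2 <- c("No", NA, "Yes")

# NA comparisons are dropped from the count, so the second element
# has zero "Yes" values rather than an unknown number of them
res <- ifelse(rowSums(cbind(v1, v2) == "Yes", na.rm = TRUE) > 0, "Yes", "No")
res
#[1] "Yes" "No"  "Yes"
```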

Upvotes: 5
