Dr. Beeblebrox

Reputation: 848

Why does ifelse fail to replace NAs?

I have a dataset where one column contains the entries yes, no, and NA. I want to replace every NA with 1 and every non-NA entry with 0. ifelse() replaces the non-NA entries with 0 but does NOT replace the NA entries with 1; I have to use is.na() for that. Why does is.na() work where ifelse() does not?

I define a reproducible example below that starts with the column defined as a factor since that's how I got the data.

    q <- as.factor(c(NA, "yes", "no", "yes", NA))

    ## Does not work
    q <- ifelse(q == "NA", 1, 0)
    q
    ### Returns: [1] NA  0  0  0 NA

    ## Does not work
    q[q == "NA"] <- 1
    q
    ### Returns: [1] NA  0  0  0 NA

    ## This works
    q[is.na(q)] <- 1
    q
    ### Returns: [1] 1 0 0 0 1

Other questions on this exist, but they do not seem to cover this precise problem. https://stackoverflow.com/a/8166616/1364839 -- This answer shows that is.na() works, but not why ifelse() fails.

Upvotes: 2

Views: 2469

Answers (1)

Gavin Simpson

Reputation: 174778

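You really don't need ifelse() here, not least because if you don't know the value of something (which is what NA indicates!), how can you compare it with something else? Any comparison involving NA is itself NA, and ifelse() returns NA wherever its test is NA, which is why the NA entries are left untouched: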

    > NA == NA ## yes, even NA can't be compared with itself
    [1] NA
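
To connect this back to the code in the question, here is a minimal sketch of what happens inside the failing ifelse() call. The variable names test, via_equals, and via_is_na are illustrative and not from the original post.

```r
q <- as.factor(c(NA, "yes", "no", "yes", NA))

# Comparing against the string "NA" is not a test for missingness:
# the missing elements compare as NA rather than TRUE.
test <- q == "NA"
print(test)

# ifelse() returns NA wherever its test is NA, so the missing
# entries come through unchanged.
via_equals <- ifelse(test, 1, 0)
print(via_equals)

# With is.na() the test is TRUE or FALSE for every element,
# so ifelse() can pick 1 or 0 everywhere.
via_is_na <- ifelse(is.na(q), 1, 0)
print(via_is_na)
```

So ifelse() itself is not broken here; the test q == "NA" simply never evaluates to TRUE for a missing value.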

Instead, use is.na() to identify whether something is NA or not. is.na() returns TRUE if an element is NA and FALSE otherwise. Then we can use the fact that FALSE == 0 and TRUE == 1 when we coerce to numeric:

    q <- as.factor(c(NA, "yes", "no", "yes", NA))
    q

    > as.numeric(is.na(q))
    [1] 1 0 0 0 1

If that is too much typing then

    > is.na(q) + 0
    [1] 1 0 0 0 1

works via the same trick except + is doing the coercion for you.
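
As a small sketch of the coercion at work, the snippet below compares a few equivalent spellings and the storage type each one produces. The variable names are illustrative only, and the integer variants are an addition that the answer above does not mention.

```r
q <- as.factor(c(NA, "yes", "no", "yes", NA))
flags <- is.na(q)                  # logical vector: TRUE where q is missing

as_double   <- as.numeric(flags)   # explicit coercion, stored as double
plus_double <- flags + 0           # arithmetic with a double gives a double
plus_int    <- flags + 0L          # arithmetic with an integer gives an integer
as_int      <- as.integer(flags)   # explicit coercion to integer

print(as_double)
print(typeof(plus_double))
print(typeof(plus_int))

# The same coercion lets sum() count the missing values directly.
n_missing <- sum(flags)
print(n_missing)
```

Whether double or integer storage is preferable depends on what the column is used for afterwards; the values are 0 and 1 either way.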

Upvotes: 4
