Reputation: 848
I have a dataset where one column contains entries of yes, no, and NA. I want to replace any NA with 1, and replace any non-NA entry with 0. ifelse replaces the non-NA entries with 0, but does NOT replace the NA entries with 1. I need to use the is.na() command for that. Why does is.na() work where ifelse does not?
I define a reproducible example below that starts with the column defined as a factor since that's how I got the data.
q <- as.factor(c(NA, "yes", "no", "yes", NA))
## Does not work
q <- ifelse(q == "NA", 1, 0)
q
### Returns: [1] NA 0 0 0 NA
## Does not work
q[q == "NA"] <- 1
q
### Returns: [1] NA 0 0 0 NA
## This works
q[is.na(q)] <- 1
q
### Returns: [1] 1 0 0 0 1
Some other entries exist, but they do not seem to have this precise problem.
https://stackoverflow.com/a/8166616/1364839 -- This answer shows that is.na() works but not why ifelse fails.
Upvotes: 2
Views: 2469
Reputation: 174778
You really don't need ifelse() here, not least because if you don't know the value of something (which is what NA indicates!), how can you compare its value with something else?
> NA == NA ## yes, even NA can't be compared with itself
[1] NA
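A short sketch of what happens inside the question's ifelse() call, using the same q. The comparison q == "NA" tests against the character string "NA", not the missing value, and it yields NA wherever q is missing. ifelse() then returns NA at those positions because a missing test selects neither branch. The names test and res are only illustrative.

```r
q <- as.factor(c(NA, "yes", "no", "yes", NA))

# Comparing with the string "NA": missing elements give NA, never TRUE
test <- q == "NA"
test
# NA at positions 1 and 5, FALSE at positions 2 to 4

# ifelse() keeps NA wherever the test is NA, so neither 1 nor 0 is chosen there
res <- ifelse(test, 1, 0)
res
# NA at positions 1 and 5, 0 at positions 2 to 4
```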
Instead, use is.na() to identify whether something is NA or not. is.na() returns TRUE if an element is NA and FALSE otherwise. Then we can use the fact that FALSE == 0 and TRUE == 1 when we coerce to numeric:
q <- as.factor(c(NA, "yes", "no", "yes", NA))
q
as.numeric(is.na(q))
> as.numeric(is.na(q))
[1] 1 0 0 0 1
If that is too much typing then
> is.na(q) + 0
[1] 1 0 0 0 1
works via the same trick, except that + is doing the coercion for you.
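If the ifelse() form is still preferred for readability, one possible variant is to use is.na(q) as the test. This is only a sketch on the same example data; the name flag is illustrative. Because is.na() never returns NA itself, ifelse() always has a branch to pick.

```r
q <- as.factor(c(NA, "yes", "no", "yes", NA))

# The test is TRUE for missing entries and FALSE otherwise, never NA
flag <- ifelse(is.na(q), 1, 0)
flag
# 1 for the two missing entries, 0 for "yes" and "no"
```

This gives the same values as as.numeric(is.na(q)), so the choice between the two is mostly a matter of taste.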
Upvotes: 4