Reputation: 4008
Please excuse the poor question title... I could not think how to ask it in a better way.
I think the code speaks for itself, but to labour the point: I 'searched' for a coded value in a data frame and replaced every occurrence with NA, but when I checked whether they were gone I got a result that surprised me.
> df[df==-999.25]
[1] "-999.25000000" "-999.25000000" "-999.25000000" "-999.25000000" "-999.25000000"
[6] "-999.25000000" "-999.25000000" "-999.25000000" "-999.25000000" "-999.25000000"
[11] "-9.992500e+02" "-9.992500e+02" "-9.992500e+02" "-9.992500e+02" "-9.992500e+02"
[16] "-9.992500e+02" "-9.992500e+02" "-9.992500e+02" "-9.992500e+02" "-9.992500e+02"
[21] "-9.992500e+02" "-9.992500e+02" "-999.25000000" "-999.25000000" "-999.25000000"
[26] "-999.25000000" "-999.25000000" "-999.25000000" "-999.25000000" "-999.25000000"
[31] "-999.25000000" "-999.25000000" "-999.25000000" "-999.25000000" "-999.25000000"
[36] "-999.25000000" "-999.25000000" "-999.25000000" "-9.992500e+02" "-9.992500e+02"
[41] "-9.992500e+02" "-9.992500e+02" "-9.992500e+02" "-9.992500e+02" "-9.992500e+02"
[46] "-9.992500e+02" "-9.992500e+02" "-9.992500e+02" "-9.992500e+02" "-9.992500e+02"
[51] "-9.992500e+02" "-9.992500e+02" "-9.992500e+02" "-9.992500e+02" "-9.992500e+02"
[56] "-9.992500e+02" "-9.992500e+02" "-9.992500e+02" "-9.992500e+02" "-9.992500e+02"
[61] "-9.992500e+02" "-9.992500e+02" "-9.992500e+02" "-9.992500e+02"
> df[df==-999.25] <- NA
> df[df==-999.25]
[1] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
[30] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
[59] NA NA NA NA NA NA
I am confused by this. What is the reason for it? (I am also tired, perhaps I should have sat on this for a day or two). I checked the help for '<-' and '[', but learnt nothing (I could not follow all of it).
Upvotes: 0
Views: 67
Reputation: 712
When you assign NA to an object, you are saying that its value is unknown (missing). Logical comparisons and arithmetic such as *, +, <, == and > cannot give a definite answer for a missing value, so R returns NA in these cases:
a <- NA
a * 0
[1] NA
a/0
[1] NA
a<0
[1] NA
a == 0
[1] NA
Finally, I guess that you expected the result to be FALSE instead of NA in df[df==-999.25], but R cannot draw any conclusion from your logical statement when the value being compared is Not Available (missing).
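To make the point concrete, here is a minimal sketch on a small toy data frame. The data frame and the variable names (mask, n_unknown, n_left, n_missing) are assumptions for illustration, not the asker's real data. It suggests that after the replacement the comparison yields NA rather than FALSE for the replaced cells, and that is.na() is the more reliable way to check for them.

```r
# Hypothetical toy data frame standing in for the asker's df (an assumption).
df <- data.frame(a = c(1, -999.25, 3), b = c(-999.25, 5, 6))
df[df == -999.25] <- NA            # replace the coded value with NA

mask <- df == -999.25              # logical matrix; NA wherever df is NA
n_unknown <- sum(is.na(mask))      # comparisons that came back NA, not FALSE
n_left <- sum(mask, na.rm = TRUE)  # cells that still equal -999.25
n_missing <- sum(is.na(df))        # is.na() tests for missing cells directly

print(c(unknown = n_unknown, left = n_left, missing = n_missing))
```

In this toy case no cell still equals -999.25. Indexing with a mask that contains NA gives back NA for each of those positions, which would explain the NAs printed by df[df==-999.25] in the question.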
Upvotes: 1
Reputation: 3525
NAs are always returned when you subset with == if they are present, because comparing NA to anything gives NA, and indexing with an NA returns NA by default. If you want them gone, you need to add & !is.na(df) to the condition. For example:
test <- c(NA,5,3,2,34,"Bob")
test[test == "Bob"]
[1] NA "Bob"
because
test == "Bob"
[1] NA FALSE FALSE FALSE FALSE TRUE
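As a rough sketch of the suggested fix, applied to the same test vector: the variable names keep_explicit and keep_which are made up for illustration, and which() is an alternative not mentioned above that, as far as I know, drops NA positions on its own.

```r
test <- c(NA, 5, 3, 2, 34, "Bob")   # same vector as in the example above

# Combine the comparison with !is.na(); NA & FALSE evaluates to FALSE,
# so the NA position is no longer selected.
keep_explicit <- test[test == "Bob" & !is.na(test)]

# which() returns only the indices where the condition is TRUE, ignoring NA.
keep_which <- test[which(test == "Bob")]

print(keep_explicit)
print(keep_which)
```

Both forms should leave only the matching element. For a data frame the same idea would be df[df == -999.25 & !is.na(df)].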
Upvotes: 3