Reputation: 1004
I want to compare two vectors elementwise to check whether an element in a certain position in the first vector is different from the element in the same position in the second vector.
The problem is that the vectors contain NA values, and comparing those values yields NA instead of TRUE or FALSE.
Reproducible example:
Here is what I get:
a<-c(1, NA, 2, 2, NA)
b<-c(1, 1, 1, NA, NA)
a!=b
[1] FALSE NA TRUE NA NA
Here is how I would like the != operator to work (treat NA values as if they were just another "level" of the variable):
a!=b
[1] FALSE TRUE TRUE TRUE FALSE
There's a possible solution at this link, but its author writes a dedicated function to perform the task. I was wondering whether there's a more elegant way to do it.
Upvotes: 13
Views: 8358
Reputation: 11
I'm not sure it's the most elegant, but
paste(a) != paste(b)
(which converts all elements of both vectors to strings) gives the desired output and is simpler than most of the other answers.
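One caveat is worth checking before relying on this. paste() turns a missing value into the string "NA", so a character vector that contains the literal text "NA" can no longer be told apart from a real NA. A minimal sketch: a and b are the vectors from the question, while x and y are made-up vectors used only for illustration.

```r
a <- c(1, NA, 2, 2, NA)
b <- c(1, 1, 1, NA, NA)

# Numeric vectors from the question: NA becomes the string "NA" on both sides
res_ab <- paste(a) != paste(b)
res_ab
#[1] FALSE TRUE TRUE TRUE FALSE

# Hypothetical character vectors: the literal string "NA" collides with a missing value
x <- c("NA", "a", NA)
y <- c(NA, "a", NA)
res_xy <- paste(x) != paste(y)
res_xy
#[1] FALSE FALSE FALSE
```

For numeric or factor-like data without a literal "NA" string this collision does not come up.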
Upvotes: 1
Reputation: 59475
I like this one, since it is pretty simple and it's easy to see that it works (source):
# This function returns TRUE wherever elements are the same, including NA's,
# and FALSE everywhere else.
compareNA <- function(v1, v2)
{
  same <- (v1 == v2) | (is.na(v1) & is.na(v2))
  same[is.na(same)] <- FALSE
  return(same)
}
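Note that compareNA() reports where elements are the same, so the result has to be negated to get the "different" vector asked for in the question. A short usage sketch with the question's vectors (the name different is only illustrative):

```r
compareNA <- function(v1, v2) {
  # TRUE where both values are equal or both are NA
  same <- (v1 == v2) | (is.na(v1) & is.na(v2))
  # any remaining NA means exactly one side was missing, i.e. not the same
  same[is.na(same)] <- FALSE
  same
}

a <- c(1, NA, 2, 2, NA)
b <- c(1, 1, 1, NA, NA)

different <- !compareNA(a, b)
different
#[1] FALSE TRUE TRUE TRUE FALSE
```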
Upvotes: 7
Reputation: 1809
Here is another solution. It's probably slower than my other answer because it's not vectorised, but it's certainly more elegant. I noticed the other day that %in% compares NA like other values. Thus, for example, c(1L, NA) %in% 1:4 gives TRUE FALSE rather than TRUE NA.
So you can have:
!mapply(`%in%`, a, b)
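To see why this works, it can help to run it on the question's vectors: mapply() calls %in% once per pair of elements, and %in% never returns NA. A small sketch (the names demo and res are only illustrative):

```r
a <- c(1, NA, 2, 2, NA)
b <- c(1, 1, 1, NA, NA)

# %in% treats NA as an ordinary value that is either matched or not
demo <- c(1L, NA) %in% 1:4
demo
#[1] TRUE FALSE

# one %in% call per position, then negate to get "different"
res <- !mapply(`%in%`, a, b)
res
#[1] FALSE TRUE TRUE TRUE FALSE
```

Because mapply() recycles shorter arguments, this sketch assumes both vectors have the same length.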
Upvotes: 4
Reputation: 887088
We could replace the NA values on the fly with a value v1 that is not present in either vector and then apply !=
f1 <- function(x, y) {
  # placeholder: the first integer in 1:1000 that occurs in neither vector
  v1 <- setdiff(1:1000, na.omit(unique(c(x, y))))[1]
  replace(x, is.na(x), v1) != replace(y, is.na(y), v1)
}
Data:
a <- c(1, NA, 2, 2, NA)
b <- c(1, 1, 1, NA, NA)
a1 <- c(NA, 1, NA)
b1 <- c(2, NA, 3)
a2 <- c(1, NA, 2, NA)
b2 <- c(1, 1, 3, NA)
Results:
f1(a, b)
#[1] FALSE TRUE TRUE TRUE FALSE
f1(a1, b1)
#[1] TRUE TRUE TRUE
f1(a2, b2)
#[1] FALSE TRUE TRUE FALSE
Upvotes: 1
Reputation: 1809
Taking advantage of the fact that:
T & NA = NA, but F & NA = F
and
F | NA = NA, but T | NA = T
The following solution works, with carefully placed brackets:
(a != b | (is.na(a) & !is.na(b)) | (is.na(b) & !is.na(a))) & !(is.na(a) & is.na(b))
You could define:
`%!=na%` <- function(e1, e2) {
  (e1 != e2 | (is.na(e1) & !is.na(e2)) | (is.na(e2) & !is.na(e1))) &
    !(is.na(e1) & is.na(e2))
}
and then use:
a %!=na% b
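As a quick check, here is the operator applied to the vectors from the question. The comments trace how the NA cases resolve, and the name res is only illustrative:

```r
`%!=na%` <- function(e1, e2) {
  # left part: TRUE when the values differ or exactly one side is NA
  #            (TRUE | NA is TRUE, so the NA from e1 != e2 is absorbed)
  # right part: FALSE when both sides are NA
  #            (NA & FALSE is FALSE, so the result is never NA)
  (e1 != e2 | (is.na(e1) & !is.na(e2)) | (is.na(e2) & !is.na(e1))) &
    !(is.na(e1) & is.na(e2))
}

a <- c(1, NA, 2, 2, NA)
b <- c(1, 1, 1, NA, NA)

res <- a %!=na% b
res
#[1] FALSE TRUE TRUE TRUE FALSE
```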
Upvotes: 14