Reputation: 300
I have two very large data.tables, A and B. The code below works fine, but it is very slow.
temp2=ifelse(is.na(A) & is.na(B),FALSE,
ifelse(!is.na(A) & is.na(V),TRUE,
ifelse(is.na(A) & !is.na(B),FALSE,
ifelse(A!=B,TRUE,FALSE))))
Is there a better alternative that would make the code run faster?
Upvotes: 2
Views: 635
Reputation: 908
Since all you need is TRUE or FALSE returned, it does not seem like you need to use ifelse at all.
If I am reading this correctly (and assuming you meant B, not V), then whenever A is NA you want FALSE returned, regardless of the value of B. So for TRUE to be returned, A must not be NA. In addition, for TRUE to be returned, A cannot equal B. But if B is NA, testing A != B returns NA rather than TRUE. Since you want TRUE when B is NA but A is not, that case has to be covered explicitly with is.na(B), so:
temp2 = (!is.na(A))&((A!=B)|is.na(B))
That should do the trick. If you really did mean V, then you have three data.tables?
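To check that the one-line expression matches the nested ifelse case by case, here is a minimal sketch on plain vectors. The toy values of A and B are made up for illustration (they are not from the question) and are chosen to cover every NA combination plus one equal and one unequal pair; it also assumes B was meant instead of V.

```r
# Hypothetical toy vectors (not the asker's data): one element per case
#   1: both NA, 2: only B is NA, 3: only A is NA, 4: equal, 5: different
A <- c(NA, 1, NA, 1, 1)
B <- c(NA, NA, 2, 1, 2)

# Original nested ifelse, with V replaced by B
nested <- ifelse(is.na(A) & is.na(B), FALSE,
          ifelse(!is.na(A) & is.na(B), TRUE,
          ifelse(is.na(A) & !is.na(B), FALSE,
          ifelse(A != B, TRUE, FALSE))))

# Single vectorised logical expression
vectorised <- !is.na(A) & (A != B | is.na(B))

# Both approaches should agree element by element
stopifnot(identical(nested, vectorised))
print(vectorised)
# [1] FALSE  TRUE FALSE FALSE  TRUE
```

This works because R's three-valued logic gives FALSE & NA = FALSE and NA | TRUE = TRUE, so the NA produced by A != B never reaches the result.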
Concerning timing,
require(data.table)
A<- data.table(v1=sample(c(1,2,NA),1e6,replace=TRUE),v2=sample(c(1,2,NA),1e6,replace=TRUE))
B<- data.table(v1=sample(c(1,2,NA),1e6,replace=TRUE),v2=sample(c(1,2,NA),1e6,replace=TRUE))
system.time({temp1 = (!is.na(A))&((A!=B)|(is.na(B)))})
## user system elapsed
## 0.41 0.00 0.41
system.time({temp2 =ifelse(is.na(A) & is.na(B),FALSE,
ifelse(!is.na(A) & is.na(B),TRUE,
ifelse(is.na(A) & !is.na(B),FALSE,
ifelse(A!=B,TRUE,FALSE))))})
## user system elapsed
## 2.56 0.11 2.68
all.equal(temp1,temp2)
## [1] TRUE
So it's about 6 times faster.
Upvotes: 1