Reputation: 1173
As far as I know, using "&" and "|" in i should be avoided, because it results in vector scans. Therefore:
data<-data.table(a=c(NA, 1, 2), b=c(1, 2, 1), key="a,b")
data[is.na(a) & b==1]
should be replaced by
data[.(NA_integer_, 1)]
But when I'm interested in all non-NA entries, how should I do that? Is it OK to use the following, or does it fall back to a slower vector scan?
data[!is.na(a) & b==1]
because something like this does not seem to work
data[.(!NA_integer_, 1)]
Upvotes: 4
Views: 2688
Reputation: 118879
Unfortunately, binary-search-based subsets currently don't support expressions of the form you require; i.e., we cannot negate on individual key columns.
The way to perform a binary search based subset at the moment would be:
require(data.table) ## v1.9.5+
a_val = setdiff(unique(data$a), NA)
setkey(data)[.(a_val, 1), nomatch=0L]
# a b
# 1: 2 1
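To check that this workaround selects the same rows as the logical-expression form from the question, the two can be compared side by side. This is only a minimal sketch on the three-row example table; the names scan_result and join_result are made up for illustration, and it says nothing about relative speed on real data.

```r
library(data.table)

data <- data.table(a = c(NA, 1, 2), b = c(1, 2, 1), key = "a,b")

# Logical expression in i, as written in the question.
scan_result <- data[!is.na(a) & b == 1]

# Keyed join: collect the non-NA values of a, pair each with b == 1,
# and drop combinations that have no matching row.
a_val <- setdiff(unique(data$a), NA)
join_result <- data[.(a_val, 1), nomatch = 0L]

# Both approaches should pick the same rows.
identical(scan_result$a, join_result$a)
identical(scan_result$b, join_result$b)
```

Note that building a_val itself scans column a once, so the gain, if any, comes from the keyed lookup that follows.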
Maybe it'd be nice to have a function, for example not() or except(), that'd allow us to extract the values internally. Care to file an FR (feature request) here?
Upvotes: 4