Reputation: 763
I have a question which is giving me unexpected troubles. The question, which seems imbarassingly simple, is that I would like to subset a data.table
according to a few conditions. Let's consider this example; imagine I have a data.table
like the one below. Now, I would like to extrapolate from it a subset such that a>=1045 and a<=1180 and (b=="c" or b=="d").
dt<-data.table(a=(1000:1200), b=sample(letters[1:3], size=201, replace=T))
I read the data.table
guides and FAQ and searched the forum, but I don't seem to find anything which is relevant. I apologize in advance if you think the question is meaningless.
Upvotes: 2
Views: 136
Reputation: 34703
More data.table
-y solution (after first defining id
as as.factor(1000:1200)
), i.e. taking advantage of binary search:
setkey(dt,a,b)
dt[.(rep(paste0(1045:1180),2),rep(c("b","c"),each=136)),nomatch=0L]
(we have to specify nomatch
because otherwise we'll just return everything--it would be easier to tell the difference if you had a third variable, which would come back as NA
for unmatched key combinations)
(of course if you cast a
as numeric, we don't need the paste0
bit)
An even cleaner way of doing this (without having to specify the recycling ourselves, basically) is with the CJ
function:
setkey(dt,a,b)
dt[CJ(paste0(1045:1180),c("c","d")),nomatch=0L]
Upvotes: 1
Reputation: 887118
You can try
dt[a>=1045 & a <=1180 & b %chin% c('c', 'd')]
Or as @David Arenburg commented, between
can be used
dt[between(a, 1045, 1180) & b %chin% c('c', 'd')]
Upvotes: 3