Anshu Chen

Reputation: 433

Using data.table to select rows by distance from another row

Let's say I have the following data.table.

library(data.table)
DT <- data.table(x=1:6, y=c(0,0,1,0,0,0))

Could I write some command DT[...] that selects all the rows within 2 rows of the one in which y=1? That is, using proximity to row three, I want to select rows 1-5.

Upvotes: 2

Views: 92

Answers (2)

Ronak Shah

Reputation: 389235

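We can use rolling operations with two different alignments to check whether any value of y equals 1 within a window of size 3. The right-aligned window looks at the current row and the two rows before it; the left-aligned window looks at the current row and the two rows after it.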

library(data.table)
library(zoo)

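# Right-aligned window: TRUE if this row or one of the 2 rows before it has y == 1
# Left-aligned window: TRUE if this row or one of the 2 rows after it has y == 1
DT[rollapplyr(y == 1, 3, any, fill = FALSE) |
     rollapply(y == 1, 3, any, fill = FALSE, align = 'left')]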

#   x y
#1: 1 0
#2: 2 0
#3: 3 1
#4: 4 0
#5: 5 0

rollapplyr(...) is the same as rollapply(..., align = 'right').
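The two windows of width 3 together cover the current row plus two rows on each side, so a single centered window of width 5 should give the same selection. A minimal sketch, assuming zoo's `partial = TRUE` argument is used so that the shorter windows at the edges are still evaluated (the names `near` and `res` are only for illustration):

```r
library(data.table)
library(zoo)

DT <- data.table(x = 1:6, y = c(0, 0, 1, 0, 0, 0))

# Centered window of width 5 = current row plus 2 rows on each side.
# partial = TRUE applies `any` to the truncated windows at both ends
# instead of filling them with a constant.
# as.logical() is a defensive coercion so the result is used as a row mask.
near <- as.logical(rollapply(DT$y == 1, width = 5, FUN = any, partial = TRUE))

res <- DT[near]
res  # rows 1 to 5 of the example table
```

Changing the distance then only means changing `width` to `2 * k + 1` for a distance of `k` rows.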

Upvotes: 2

akrun

Reputation: 887851

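Here is one option: loop over the positions where y == 1 (which(y == 1)) with lapply, build the sequence from i - 2 to i + 2 around each position, flatten the result with unlist, keep the unique elements (in case the windows overlap), and subset the rows with that index.

library(data.table)
# unlist(lapply(...)) returns a plain integer vector even when several rows have y == 1
DT[unique(unlist(lapply(which(y == 1), function(i) (i - 2):(i + 2))))]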

-output

#   x y
#1: 1 0
#2: 2 0
#3: 3 1
#4: 4 0
#5: 5 0

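If a match lies within two rows of the top of the table, the sequence will contain zero or negative indices. In that case, build the index first and keep only the positive values:

# Collect the candidate row numbers around every match
i1 <- DT[, unique(unlist(lapply(which(y == 1), function(i) (i - 2):(i + 2))))]
# Drop zero and negative positions before subsetting
DT[i1[i1 > 0]]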
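As an alternative to filtering the index afterwards, the selection can be written as a logical mask over the existing rows, which cannot go out of range at either end of the table. This is a sketch of a different technique (a distance matrix built with base R's `outer`), not the approach above; `k`, `hits`, `near` and `res` are hypothetical names:

```r
library(data.table)

DT <- data.table(x = 1:6, y = c(0, 0, 1, 0, 0, 0))
k <- 2L                          # hypothetical name: maximum row distance

hits <- which(DT$y == 1)         # row numbers where y == 1
rows <- seq_len(nrow(DT))        # every row number in the table

# dist[r, h] is the distance between row r and the h-th matching row
dist <- abs(outer(rows, hits, "-"))

# A row is kept if at least one matching row is within k rows of it
near <- rowSums(dist <= k) > 0

res <- DT[near]
res  # rows 1 to 5 of the example table
```

The distance matrix has one column per matching row, so this suits tables where only a few rows have y == 1.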

Upvotes: 2
