Shift in datatable evaluation

Question

I was trying to use shift in data.table as follows:

DT <- data.table(a = c(1,2,3,4,5,6,7), b = c("A","A","A","B","B","B","B"))
DT[!is.na(shift(a, 1)), xy := shift(a, 1), by = b]

This is just a test to demonstrate the issue: in the second row, xy is NA. The reason is that shift in i (the first argument of DT[i, j, by]) does not take by into account, whereas shift in j does. This seems confusing to me. Is there a reason for this behavior?

Jaap · Accepted Answer

You get an NA value in the second row because !is.na(shift(a, 1)) filters out the first row:

> DT[, !is.na(shift(a, n = 1))]
[1] FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE

The by-operation is not applied to this expression, because the filtering in i is done on the whole table before the grouping.

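As a result, xy := shift(a, 1) only sees rows 2 to 7. Row 2 is then the first remaining row of group A, so its shifted value is NA. For the same reason row 4, the first row of group B, also gets NA.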
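
A minimal sketch of the two behaviours side by side, using the DT from the question. The column names xy_filtered and xy_grouped and the helper vector keep are made up for illustration. It assumes that what you want is simply the previous value of a within each group, in which case the filter in i can be dropped, since shift() under by = b already returns NA for the first row of every group.

```r
library(data.table)

DT <- data.table(a = c(1,2,3,4,5,6,7), b = c("A","A","A","B","B","B","B"))

# i is evaluated on the whole table, ignoring by: only row 1 is dropped
keep <- DT[, !is.na(shift(a, 1))]

# Equivalent to the original call: j runs per group, but only on rows 2 to 7
DT[keep, xy_filtered := shift(a, 1), by = b]

# No filter in i: shift() runs per group on all rows
DT[, xy_grouped := shift(a, 1), by = b]

print(DT)
```

Here xy_filtered comes out as NA, NA, 2, NA, 4, 5, 6, while xy_grouped is NA, 1, 2, NA, 4, 5, 6.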
