Ashin Mukherjee

Reputation: 267

data.table lag operator throwing error

Hi, I am trying to create a data.table with lagged variables by group id. Some ids have only one row in the data.table; in that case shift with type 'lag' throws an error, while type 'lead' works fine. Here is an example:

library(data.table)
dt = data.table(id = 1, week = as.Date('2014-11-11'), sales = 1)
lead = 2
lag = 2
lagSalesNames = paste('lag_sales_', 1:lag, sep = '')
dt[,(lagSalesNames) := shift(sales, 1:lag, NA, 'lag'), by = list(id)]

This gives me the following error:

All items in j=list(...) should be atomic vectors or lists. If you are trying something like j=list(.SD,newcol=mean(colA)) then use := by group instead
 (much quicker), or cbind or merge afterwards.

But if I try the same thing with lead instead, it works fine:

dt[,(lagSalesNames) := shift(sales, 1:lag, NA, 'lead'), by = list(id)]

It also seems to work fine if the data.table has more than one row. For example, the following with 2 rows works:

dt = data.table(id = 1, week = as.Date(c('2014-11-11', '2014-11-11')), sales = 1:2)
dt[,(lagSalesNames) := shift(sales, 1:lag, NA, 'lag'), by = list(id)]

I am using data.table version 1.9.5 on a Linux machine with R version 3.1.0. Any help would be much appreciated.

Thanks, Ashin

Upvotes: 3

Views: 282

Answers (1)

Arun

Reputation: 118889

Thanks for the report. This is now fixed (issue #1014) with commit #1722 in data.table v1.9.5.

Now works as intended:

dt
#    id       week sales lag_sales_1 lag_sales_2
# 1:  1 2014-11-11     1          NA          NA
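
To check whether an installed data.table already includes this fix, the single-row case from the question can be rerun as a small self-contained script. This is only a sketch: the names `n_lags` and `lag_names` are illustrative and not from the original post, and it assumes a data.table build that contains the fix.

```r
library(data.table)

# Single-row group, as in the question
dt <- data.table(id = 1, week = as.Date("2014-11-11"), sales = 1)

# Illustrative names (not from the original post)
n_lags <- 2
lag_names <- paste0("lag_sales_", seq_len(n_lags))

# One lagged column per requested lag, computed within each id.
# With only one row per group, every lagged value should be the fill value (NA).
dt[, (lag_names) := shift(sales, seq_len(n_lags), fill = NA, type = "lag"), by = id]

print(dt)
```

If the call still fails with the "All items in j=list(...)" error, the installed build most likely predates the fix and needs updating.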

Upvotes: 2
