Reputation: 267
Hi, I am trying to create a data.table with lagged variables by group id. Some ids have only one row in the data.table. In that case shift with type 'lag' gives an error, but type 'lead' works fine. Here is an example:
library(data.table)
dt = data.table(id = 1, week = as.Date('2014-11-11'), sales = 1)
lead = 2
lag = 2
lagSalesNames = paste('lag_sales_', 1:lag, sep = '')
dt[,(lagSalesNames) := shift(sales, 1:lag, NA, 'lag'), by = list(id)]
This gives me the following error:
All items in j=list(...) should be atomic vectors or lists. If you are trying something like j=list(.SD,newcol=mean(colA)) then use := by group instead (much quicker), or cbind or merge afterwards.
But if I try the same thing with lead instead, it works fine:
dt[,(lagSalesNames) := shift(sales, 1:lag, NA, 'lead'), by = list(id)]
It also seems to work fine if the data.table has more than one row. For example, the following with two rows works:
dt = data.table(id = 1, week = as.Date(c('2014-11-11', '2014-11-11')), sales = 1:2)
dt[,(lagSalesNames) := shift(sales, 1:lag, NA, 'lag'), by = list(id)]
I am using data.table version 1.9.5 on a Linux machine with R version 3.1.0. Any help would be much appreciated.
Thanks, Ashin
Upvotes: 3
Views: 282
Reputation: 118889
Thanks for the report. This is now fixed (issue #1014) with commit #1722 in data.table v1.9.5.
Now works as intended:
dt
#    id       week sales lag_sales_1 lag_sales_2
# 1:  1 2014-11-11     1          NA          NA
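To check the fix on your own installation, something like the following sketch may help. It mixes a single-row group with a multi-row group, so both cases from the question run in one call. The data values and the names n_lag and lag_cols are made up for illustration, and it assumes a data.table version that includes the fix.

```r
library(data.table)

# Hypothetical data: id 1 has a single row, id 2 has three rows.
dt <- data.table(
  id    = c(1, 2, 2, 2),
  week  = as.Date(c('2014-11-11', '2014-11-11', '2014-11-18', '2014-11-25')),
  sales = c(1, 10, 20, 30)
)

n_lag    <- 2
lag_cols <- paste0('lag_sales_', 1:n_lag)

# With a vector n, shift() returns a list with one vector per lag,
# so the result can be assigned to several columns at once.
dt[, (lag_cols) := shift(sales, n = 1:n_lag, fill = NA, type = 'lag'), by = id]

print(dt)
```

The single-row group should come back with NA in both lag columns, and the three-row group should only be filled where an earlier row exists within the same id.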
Upvotes: 2