Reputation: 8021
I use data.table quite heavily for reshaping my data. However, after updating the data.table package, my code no longer works.
I basically want to expand my dataset based on two columns (start.date and stop.date).
Please see the toy example below:
# Set up toy data
library(data.table)
id <- letters[1:3]
start.date <- as.Date(c("2012-01-01", "2012-01-03", "2012-01-05"))
stop.date <- as.Date(c("2012-01-03", "2012-01-06", "2012-01-06"))
d <- data.table(id, start.date, stop.date)
# This is what the input data looks like
# id start.date stop.date
# 1: a 2012-01-01 2012-01-03
# 2: b 2012-01-03 2012-01-06
# 3: c 2012-01-05 2012-01-06
# Code that worked with an older version of data.table
d.new <- d[, c(.SD, list(time=seq(start.date, stop.date, by="days"))), by=id]
# After the update, the result looks like this:
# id start.date stop.date V3
# 1: a 2012-01-01 2012-01-03 2012-01-01,2012-01-02,2012-01-03
# 2: b 2012-01-03 2012-01-06 2012-01-03,2012-01-04,2012-01-05,2012-01-06
# 3: c 2012-01-05 2012-01-06 2012-01-05,2012-01-06
This is what the final data should look like (and did look like before I updated the data.table package):
# id start.date stop.date time
# 1: a 2012-01-01 2012-01-03 2012-01-01
# 2: a 2012-01-01 2012-01-03 2012-01-02
# 3: a 2012-01-01 2012-01-03 2012-01-03
# 4: b 2012-01-03 2012-01-06 2012-01-03
# 5: b 2012-01-03 2012-01-06 2012-01-04
# 6: b 2012-01-03 2012-01-06 2012-01-05
# 7: b 2012-01-03 2012-01-06 2012-01-06
# 8: c 2012-01-05 2012-01-06 2012-01-05
# 9: c 2012-01-05 2012-01-06 2012-01-06
Reputation: 118779
Thanks for catching this one and also for filing the bug #861. This is now fixed in v1.9.5. From NEWS:
Some optimisations of .SD in j was done in 1.9.4, refer to #735. Due to an oversight, j-expressions of the form c(lapply(.SD, ...), list(...)) were optimised improperly. This is now fixed. Thanks to @mmeierer for filing #861.
That is:
d.new <- d[, c(.SD, list(time=seq(start.date, stop.date, by="days"))), by=id]
will work as intended, but faster (as it is internally optimised - now correctly).
My earlier suggestion reflected how I thought it should work, and I had implemented the optimisation that way (which was incorrect). Now all good to go :-).
We plan to push the next release very soon, with a bunch of quick, high-priority fixes so that things run smoothly.
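For anyone who would rather not depend on how c(.SD, list(...)) is optimised, here is a minimal sketch that names every output column explicitly. It assumes each id has exactly one row, as in the toy data; the name d.long is just illustrative, and I have not verified how this form behaves on 1.9.4 specifically.

```r
library(data.table)

# Same toy data as in the question
d <- data.table(
  id         = letters[1:3],
  start.date = as.Date(c("2012-01-01", "2012-01-03", "2012-01-05")),
  stop.date  = as.Date(c("2012-01-03", "2012-01-06", "2012-01-06"))
)

# Build the output columns by name instead of combining .SD with a list.
# Within each id group, start.date and stop.date have length 1, so they are
# recycled to the length of the generated daily sequence.
d.long <- d[, list(start.date = start.date,
                   stop.date  = stop.date,
                   time       = seq(start.date, stop.date, by = "day")),
            by = id]

print(d.long)
```

On the toy data this should give one row per id and day, in the same long shape as the desired output in the question.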