Reputation: 1345
My question will be explained throughout the following reproducible example.
First, let's load the required packages and create a POSIXct
and a data.table
object.
library(data.table)
library(lubridate)
target_date <- ymd(20180601, tz='America/Montreal')
test <- data.table(
V1 = seq(1:3),
V2 = c(ymd(20170421, tz='America/Montreal'),
ymd(20170702, tz='America/Montreal'),
ymd(20180113, tz='America/Montreal'))
)
As we can see below, everything worked fine.
test[]
## V1 V2
## 1: 1 2017-04-21
## 2: 2 2017-07-02
## 3: 3 2018-01-13
Then, I create a new column which contains Interval
objects.
test[, V3:=interval(V2, target_date)]
Still there, everything is alright.
test[]
## V1 V2 V3
## 1: 1 2017-04-21 2017-04-21 EDT--2018-06-01 EDT
## 2: 2 2017-07-02 2017-07-02 EDT--2018-06-01 EDT
## 3: 3 2018-01-13 2018-01-13 EST--2018-06-01 EDT
Then, I want to create two new columns which will only be defined on the first two rows: the first one is a function of the POSIXct
column and the second one is a function of the Interval
column.
test[c(1, 3), V4 := V2 + days(20)]
test[]
## V1 V2 V3 V4
## 1: 1 2017-04-21 2017-04-21 EDT--2018-06-01 EDT 2017-05-11
## 2: 2 2017-07-02 2017-07-02 EDT--2018-06-01 EDT <NA>
## 3: 3 2018-01-13 2018-01-13 EST--2018-06-01 EDT 2018-02-02
Still correct at this point.
Then, when I try the same thing with a column function of the Interval
column, I get the following warning
test[c(1, 3), V5 := ymd(20180101, tz='America/Montreal') %within% V3]
## Warning messages:
## 1: In as.numeric(a) - as.numeric(b@start) <= [email protected] :
## longer object length is not a multiple of shorter object length
## 2: In `[.data.table`(test, c(1, 3), `:=`(V5, ymd(20180101, tz = "America/Montreal") %within% :
## Supplied 3 items to be assigned to 2 items of column 'V5' (1 unused)
This warning
tells me that, in fact, the command produced 3 values and that I try to input those in a 2 rows data.table
(the filtered data.table
has two rows).
I tried to figure out what the problem was and I think I found a hint. Consider the two following commands:
test[, V3][2]
## [1] 2017-07-02 EDT--2018-06-01 EDT
test[2, V3]
## [1] 2017-04-21 EDT--2018-03-21 EDT 2017-07-02 EDT--2018-06-01 EDT 2018-01-13 EST--2018-12-13 EST
In fact, I thought those two commands would have produce the same result, which is not the case. Even more surprising, they both have a length
of 1 (when looking at the output above, I thought the second command would have a length
of 3).
length(test[, V3][2])
## [1] 1
length(test[2, V3])
## [1] 1
The thing is, these two commands produce an object with a vector of 1 in the .Data
slot (the length of the Interval
in seconds I guess),
test[, V3][2]@.Data
## [1] 28857600
test[2, V3]@.Data
## [1] 28857600
but the first has a vector of 1 in the start
slot while the second has a vector of 3 in the start
slot.
test[, V3][2]@start
## [1] "2017-07-02 EDT"
test[2, V3]@start
## [1] "2017-04-21 EDT" "2017-07-02 EDT" "2018-01-13 EST"
I know I could probably fix this problem by encapsulating each Interval
objects in a list and extract them every time I need, but is there another way to deal with this?
Upvotes: 5
Views: 395
Reputation: 588
Might be a bit filthy, but could you just go:
test[c(1, 3), V5 := ymd(20180101, tz='America/Montreal') %within% V3[.I]]
test[, dat := (V3[.I]@.Data)]
Upvotes: 2