Soheil

Reputation: 974

How to lag a time variable and keep the format?

I want to lag the time variable itself, but when I do, the format changes. A simple example:

data<-data.frame(number=seq(1:5), 
      datetime=seq(as.POSIXct("2015/06/12"),as.POSIXct("2015/06/16"),by="1 day"))

   number   datetime
1      1 2015-06-12
2      2 2015-06-13
3      3 2015-06-14
4      4 2015-06-15
5      5 2015-06-16

What I want:

  number   datetime datetime.lag
1      1 2015-06-12           NA
2      2 2015-06-13   2015-06-12
3      3 2015-06-14   2015-06-13
4      4 2015-06-15   2015-06-14
5      5 2015-06-16   2015-06-15

What I tried:

data$datetime.lag<-c(NA, head(data$datetime, -1))

What I get:

  number   datetime datetime.lag
1      1 2015-06-12           NA
2      2 2015-06-13   1434092400
3      3 2015-06-14   1434178800
4      4 2015-06-15   1434265200
5      5 2015-06-16   1434351600

Why does the format change? Is there a better way to do this?

Upvotes: 2

Views: 1852

Answers (3)

thelatemail

Reputation: 93938

Your dates are being coerced to numeric because a bare NA is logical, not POSIXct, so c() dispatches on that first argument and drops the class. Use a POSIXct NA instead:

data$datetime.lag <- c(as.POSIXct(NA), head(data$datetime, -1))
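
To make the coercion visible, here is a minimal base R sketch. The explicit tz = "UTC" and the helper name lag_keep_class are assumptions made for illustration; they are not part of the question or of any package. The helper lags by indexing rather than by c(), which should keep the class and attributes of the input:

```r
# Same dates as in the question; tz = "UTC" is an assumption so the
# result does not depend on the local time zone.
datetime <- seq(as.POSIXct("2015-06-12", tz = "UTC"),
                as.POSIXct("2015-06-16", tz = "UTC"), by = "1 day")

# A bare NA is logical, so c() uses the default method and returns
# plain seconds since 1970-01-01.
bad <- c(NA, head(datetime, -1))
class(bad)   # "numeric"

# A typed NA makes c() dispatch to the POSIXct method.
good <- c(as.POSIXct(NA), head(datetime, -1))

# Hypothetical helper: lag by indexing, so the class is never dropped.
lag_keep_class <- function(x, n = 1L) {
  stopifnot(n >= 0)
  idx <- seq_along(x) - n   # position of the value n steps earlier
  idx[idx < 1L] <- NA       # no earlier observation, so NA
  x[idx]
}

lagged <- lag_keep_class(datetime)
```

The indexing version also works unchanged for Date vectors and factors, since it never combines values of different classes.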

Upvotes: 2

akrun

Reputation: 887901

You could use shift from the devel version of data.table, i.e. v1.9.5 (it has since been released on CRAN, from v1.9.6). The default type is "lag" and the default n is 1. Instructions to install the devel version are here

library(data.table)
setDT(data)[, lagdt:= shift(datetime)][]
#    number   datetime      lagdt
#1:      1 2015-06-12       <NA>
#2:      2 2015-06-13 2015-06-12
#3:      3 2015-06-14 2015-06-13
#4:      4 2015-06-15 2015-06-14
#5:      5 2015-06-16 2015-06-15

We can also get multiple lags at once:

setDT(data)[, paste0('lagDT', 1:2) :=shift(datetime, 1:2)][]
#    number   datetime     lagDT1     lagDT2
#1:      1 2015-06-12       <NA>       <NA>
#2:      2 2015-06-13 2015-06-12       <NA>
#3:      3 2015-06-14 2015-06-13 2015-06-12
#4:      4 2015-06-15 2015-06-14 2015-06-13
#5:      5 2015-06-16 2015-06-15 2015-06-14

Upvotes: 3

r2evans

Reputation: 160952

When I need a lag like this, I'm often already using dplyr, which has a lag function that meets your needs:

library(dplyr)

# dat is the data frame from the question
mutate(dat, lagdt=lag(datetime))
##   number   datetime      lagdt
## 1      1 2015-06-12       <NA>
## 2      2 2015-06-13 2015-06-12
## 3      3 2015-06-14 2015-06-13
## 4      4 2015-06-15 2015-06-14
## 5      5 2015-06-16 2015-06-15

Equivalently, depending on your acceptance of the %>% pipe operator (ceci n'est pas un pipe):

dat %>% mutate(dtlag=lag(datetime))
##   number   datetime      dtlag
## 1      1 2015-06-12       <NA>
## 2      2 2015-06-13 2015-06-12
## 3      3 2015-06-14 2015-06-13
## 4      4 2015-06-15 2015-06-14
## 5      5 2015-06-16 2015-06-15

Upvotes: 0
