Mansoor

Reputation: 1259

Lag computation in Time Series with missing value in R

I have a time series like the one below. In R, I want to compute lag N only where the date-times are continuous: if the previous entry is more than N hours earlier, the lag should be skipped (left as NA) rather than computed.

                   t         val
  2005-01-17 17:30:00       14.3
  2005-01-17 18:30:00       14.0
  2005-01-17 19:30:00       14.3
  2005-01-17 22:30:00       14.9
  2005-01-17 23:30:00       14.2
  2005-01-18 00:30:00       14.1

The entries for 2005-01-17 20:30:00 and 2005-01-17 21:30:00 are missing, so no lag should be computed across that gap.

Expected output

                   t         val   val_lag   val_lag2
  2005-01-17 17:30:00       14.3        NA         NA
  2005-01-17 18:30:00       14.0      14.3         NA
  2005-01-17 19:30:00       14.3      14.0       14.3
  2005-01-17 22:30:00       14.9        NA         NA
  2005-01-17 23:30:00       14.2      14.9         NA
  2005-01-18 00:30:00       14.1      14.2       14.9

Thanks

Upvotes: 1

Views: 476

Answers (1)

akrun

Reputation: 887038

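We could create a grouping variable from the diff of the 't' column, starting a new group wherever consecutive timestamps are not one hour apart, and then take the lag of 'val' within each group. For a lag of N, use lag(val, n = N) in the same way.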

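library(dplyr)

# Start a new group whenever the gap to the previous row is not exactly 1 hour
# (assumes 't' is POSIXct and sorted), then lag within each group.
df1 %>%
   group_by(grp = cumsum(c(TRUE, as.numeric(diff(t), units = "hours") != 1))) %>%
   mutate(val_lag = lag(val),
          val_lag2 = lag(val, 2)) %>%
   ungroup() %>%
   select(-grp)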
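
For reference, the same grouping idea can be sketched in base R and generalised to any lag N. This is only an illustration, not part of the answer above: the helper name lag_within_runs and its step_hours argument are made up here, and it assumes 't' is a sorted POSIXct column on a fixed hourly grid. The sample data is rebuilt from the question so the snippet runs on its own.

```r
# Sample data from the question; tz = "UTC" is an assumption made to avoid
# daylight-saving surprises.
df1 <- data.frame(
  t = as.POSIXct(c("2005-01-17 17:30:00", "2005-01-17 18:30:00",
                   "2005-01-17 19:30:00", "2005-01-17 22:30:00",
                   "2005-01-17 23:30:00", "2005-01-18 00:30:00"),
                 tz = "UTC"),
  val = c(14.3, 14.0, 14.3, 14.9, 14.2, 14.1)
)

# Hypothetical helper: lag 'val' by n rows, but only inside runs of
# consecutive timestamps that are exactly 'step_hours' apart.
lag_within_runs <- function(t, val, n = 1, step_hours = 1) {
  # Gap between neighbouring rows, forced to hours regardless of auto units
  gap <- as.numeric(diff(t), units = "hours")
  # A new run starts at the first row and after every irregular gap
  run <- cumsum(c(TRUE, gap != step_hours))
  # Within each run, shift the values down by n and pad the top with NA
  ave(val, run, FUN = function(v) c(rep(NA, n), v)[seq_along(v)])
}

df1$val_lag  <- lag_within_runs(df1$t, df1$val, n = 1)
df1$val_lag2 <- lag_within_runs(df1$t, df1$val, n = 2)
print(df1)
```

On the sample data this should give NA in val_lag at 17:30 and 22:30 (the first row of each run) and the val_lag2 column from the expected output in the question.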

Upvotes: 2
