Leon

Reputation: 47

Is it possible to lag values of a data frame in R by a time index?

My question concerns lagging data in R in a way that respects the time index. I hope this has not already been asked in another thread. Let's consider a simple setup:

df <- data.frame(date=as.Date(c("1990-01-01","1990-02-01","1990-01-15","1990-03-01","1990-05-01","1990-07-01","1993-01-02")), value=1:7)

This code generates a table like

date value
1990-01-01 1
1990-02-01 2
1990-01-15 3
1990-03-01 4
1990-05-01 5
1990-07-01 6
1993-01-02 7

My aim is to lag "value" by, e.g., one month. For example, the lagged value for "1990-05-01" would be the value at 1990-04-01, which is not present in the data, so that row should get an NA. When I use the standard lag function, R is not aware of the time index and simply uses the value "4" from 1990-03-01, which is not what I want. Does anyone have an idea what I could do here?
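To make the problem concrete, here is a minimal sketch of the positional behaviour described above. It assumes that "the standard lag function" means dplyr::lag(); the column name value_lag_positional is only illustrative.

```r
library(dplyr)

df <- data.frame(
  date = as.Date(c("1990-01-01", "1990-02-01", "1990-01-15", "1990-03-01",
                   "1990-05-01", "1990-07-01", "1993-01-02")),
  value = 1:7
)

# dplyr::lag() shifts by row position only; it never looks at `date`
df$value_lag_positional <- dplyr::lag(df$value, n = 1)

# Row 5 (1990-05-01) receives 4, the value of the previous row (1990-03-01),
# although no observation exists for 1990-04-01
df
```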

Thanks in advance! :)

All the best,

Leon

Upvotes: 1

Views: 181

Answers (2)

Leon

Reputation: 47

For an example with multiple columns, let's consider:

df <- data.frame(date=as.Date(c("1990-01-01","1990-02-01","1990-01-15","1990-03-01","1990-05-01","1990-07-01","1993-01-02")), value=1:7, value2=7:13)

I recently found the following solution myself:

library(dplyr)
library(lubridate)
lags <- 1  # number of months to lag by
df %>%
  as_tibble() %>%
  mutate(across(2:ncol(df), .fns = function(x) x[match(date %m-% months(lags), date)], .names = "{.col}_lag"))

Thanks to your code @ThomasIsCoding. :)
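For comparison, the same multi-column lag can probably be written in base R plus lubridate by computing the row lookup once and reusing it for every value column. This is only a sketch, not the approach from either answer as posted; idx and value_cols are illustrative names, and it assumes every column other than date should be lagged.

```r
library(lubridate)

df <- data.frame(
  date = as.Date(c("1990-01-01", "1990-02-01", "1990-01-15", "1990-03-01",
                   "1990-05-01", "1990-07-01", "1993-01-02")),
  value = 1:7,
  value2 = 7:13
)

# For each row, the index of the row dated exactly one month earlier (NA if none)
idx <- match(df$date %m-% months(1), df$date)

# Assumption: every column except `date` gets a lagged copy
value_cols <- setdiff(names(df), "date")
df[paste0(value_cols, "_lag")] <- lapply(df[value_cols], function(x) x[idx])

df
```

Because idx does not depend on the column being lagged, it only has to be computed once, however many columns there are.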

Upvotes: 0

ThomasIsCoding

Reputation: 101335

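You can use lubridate's %m-% operator to shift each date back by one month, then use match() to look up the value at that shifted date, like below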

library(lubridate)
transform(
  df,
  value_lag = value[match(date %m-% months(1), date)]
)

which gives

        date value value_lag
1 1990-01-01     1        NA
2 1990-02-01     2         1
3 1990-01-15     3        NA
4 1990-03-01     4         2
5 1990-05-01     5        NA
6 1990-07-01     6        NA
7 1993-01-02     7        NA
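If a different number of months is needed, the same lookup can be wrapped in a small helper. This is a sketch only: lag_by_months is a hypothetical name, not a lubridate function, and it assumes each date occurs at most once, since match() returns only the first matching position.

```r
library(lubridate)

# Hypothetical helper: value of x observed exactly n months before each date,
# NA where no such date exists in the data
lag_by_months <- function(x, date, n = 1) {
  x[match(date %m-% months(n), date)]
}

df <- data.frame(
  date = as.Date(c("1990-01-01", "1990-02-01", "1990-01-15", "1990-03-01",
                   "1990-05-01", "1990-07-01", "1993-01-02")),
  value = 1:7
)

df$value_lag1 <- lag_by_months(df$value, df$date, n = 1)
df$value_lag2 <- lag_by_months(df$value, df$date, n = 2)
df
```

Note that %m-% rolls back to the last day of the month when the shifted day does not exist (for example, 1990-03-31 minus one month becomes 1990-02-28), so end-of-month dates may match a row you did not expect.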

Upvotes: 2
