Reputation: 47
My question concerns lagging data in R, where R should be aware of the time index. I hope the question has not already been asked in another thread. Let's consider a simple setup:
df <- data.frame(date=as.Date(c("1990-01-01","1990-02-01","1990-01-15","1990-03-01","1990-05-01","1990-07-01","1993-01-02")), value=1:7)
This code generates a table like
date | value |
---|---|
1990-01-01 | 1 |
1990-02-01 | 2 |
1990-01-15 | 3 |
1990-03-01 | 4 |
1990-05-01 | 5 |
1990-07-01 | 6 |
1993-01-02 | 7 |
My aim is to lag "value" by, e.g., one month. When I compute the lagged value for "1990-05-01", the lookup date is 1990-04-01, which is not present in the data, so that row should get an NA. When I use the standard lag function, R is not aware of the time index and simply uses the value "4" from 1990-03-01 (the previous row), which is not what I want. Does anyone have an idea what I could do here?
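To make the problem concrete, here is a minimal sketch of what a purely positional lag does, assuming the "standard lag function" behaves like a one-row shift (as dplyr::lag does). It uses base R only, and the name positional_lag is just for illustration.

```r
df <- data.frame(
  date  = as.Date(c("1990-01-01", "1990-02-01", "1990-01-15", "1990-03-01",
                    "1990-05-01", "1990-07-01", "1993-01-02")),
  value = 1:7
)

# Positional lag: shift the column down by one row, ignoring the dates.
positional_lag <- c(NA, head(df$value, -1))

# The 1990-05-01 row receives the value of the previous row (1990-03-01),
# although no observation exists for 1990-04-01.
positional_lag[df$date == as.Date("1990-05-01")]
```

The last expression returns 4, whereas the desired result for that row is NA.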
Thanks in advance! :)
All the best,
Leon
Upvotes: 1
Views: 181
Reputation: 47
For an example with multiple columns, let's consider:
df <- data.frame(date=as.Date(c("1990-01-01","1990-02-01","1990-01-15","1990-03-01","1990-05-01","1990-07-01","1993-01-02")), value=1:7, value2=7:13)
I have since found the following solution myself:
library(dplyr)
library(lubridate)
lags <- 1  # number of months to lag by
df %>%
  as_tibble() %>%
  mutate(across(2:ncol(df), .fns = function(x) x[match(date %m-% months(lags), date)], .names = "{.col}_lag"))
Thanks for your code, @ThomasisCoding. :)
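For comparison, the same idea can be sketched without dplyr by computing the row index once and reusing it for every value column. This is only an illustrative variant, assuming every column except date should be lagged; idx, value_cols and lags are names chosen for this example.

```r
library(lubridate)

df <- data.frame(
  date   = as.Date(c("1990-01-01", "1990-02-01", "1990-01-15", "1990-03-01",
                     "1990-05-01", "1990-07-01", "1993-01-02")),
  value  = 1:7,
  value2 = 7:13
)

lags <- 1  # number of months to lag by (assumed)

# For each row, the position of the row dated `lags` months earlier;
# NA where no such date exists in the data.
idx <- match(df$date %m-% months(lags), df$date)

# Apply the same index to every column except the date.
value_cols <- setdiff(names(df), "date")
df[paste0(value_cols, "_lag")] <- lapply(df[value_cols], function(x) x[idx])

df
```

This adds the columns value_lag and value2_lag, each filled with NA wherever the date one month earlier is missing.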
Upvotes: 0
Reputation: 101335
You can try lubridate's %m-% operator to shift each date back by one month and then look the shifted date up with match, like below
library(lubridate)
transform(
df,
value_lag = value[match(date %m-% months(1), date)]
)
which gives
date value value_lag
1 1990-01-01 1 NA
2 1990-02-01 2 1
3 1990-01-15 3 NA
4 1990-03-01 4 2
5 1990-05-01 5 NA
6 1990-07-01 6 NA
7 1993-01-02 7 NA
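If the lag length needs to vary, the lookup above can be wrapped in a small function. The following is a sketch only: lag_by_month is a hypothetical helper, not a lubridate function, and it assumes the dates are unique (match returns the first matching row if they are not).

```r
library(lubridate)

# Hypothetical helper: calendar-aware lag of x by n months.
lag_by_month <- function(x, date, n = 1) {
  target <- date %m-% months(n)  # the date n months before each row's date
  x[match(target, date)]         # NA where that date is absent from the data
}

df <- data.frame(
  date  = as.Date(c("1990-01-01", "1990-02-01", "1990-01-15", "1990-03-01",
                    "1990-05-01", "1990-07-01", "1993-01-02")),
  value = 1:7
)

df$value_lag1 <- lag_by_month(df$value, df$date)         # one month back
df$value_lag2 <- lag_by_month(df$value, df$date, n = 2)  # two months back
df
```

Note that %m-% rolls back to the last day of the target month when the exact day does not exist (for example, one month before 1990-03-31 is 1990-02-28), so month-end dates may match a different day of the month than expected.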
Upvotes: 2