Reputation: 1305
Ok so I am looking to create rolling lagged differences in R.
vec <- c(43.79979, 44.04865, 44.17308, 44.54638, 44.79524, 44.79524, 44.79524, 44.42195, 44.54638, 44.79524, 44.42195, 43.30206, 43.30206, 43.17764, 43.30206)
> length(vec)
[1] 15
This is what I have tried so far:
library(zoo)  # rollapply() comes from the zoo package
vec1 <- rollapply(vec, width = 2, fill = NA, FUN = diff)
This gives this output:
[1] 0.24886 0.12443 0.37330 0.24886 0.00000 0.00000 -0.37329 0.12443 0.24886 -0.37329 -1.11989 0.00000 -0.12442 0.12442 NA
> length(vec1)
[1] 15
Note we have an NA value in element 15.
I want to compute this difference for several lags, say 1, 2 and 3. The code above doesn't cater for that, so I tried the following:
lag1 <- diff(vec, lag = 1, differences = 1, arithmetic = TRUE, na.pad = TRUE)
lag2 <- diff(vec, lag = 2, differences = 1, arithmetic = TRUE, na.pad = TRUE)
lag3 <- diff(vec, lag = 3, differences = 1, arithmetic = TRUE, na.pad = TRUE)
length(lag1)
length(lag2)
length(lag3)
The result of this:
> lag1
[1] 0.24886 0.12443 0.37330 0.24886 0.00000 0.00000 -0.37329 0.12443 0.24886 -0.37329 -1.11989 0.00000 -0.12442 0.12442
> lag2
[1] 0.37329 0.49773 0.62216 0.24886 0.00000 -0.37329 -0.24886 0.37329 -0.12443 -1.49318 -1.11989 -0.12442 0.00000
> lag3
[1] 0.74659 0.74659 0.62216 0.24886 -0.37329 -0.24886 0.00000 0.00000 -1.24432 -1.49318 -1.24431 0.00000
> length(lag1)
[1] 14
> length(lag2)
[1] 13
> length(lag3)
[1] 12
Notice that the lagged difference above computes current value - lagged value, but places each result at the position of the lagged (earlier) value, so the vector gets shorter. I want each result placed at the position of the current value instead, with leading NAs to account for the missing values at the start of the data set.
Using lag 2 as an example, this is my desired result:
> lag2
[1] NA NA 0.37329 0.49773 0.62216 0.24886 0.00000 -0.37329 -0.24886 0.37329 -0.12443 -1.49318 -1.11989 -0.12442 0.00000
Does anyone know how to achieve this?
To maybe explain a little more:
This is the start of the vector:
vec <- c(43.79979, 44.04865, 44.17308.....
So if we do a lag-2 difference, we take the 3rd element, 44.17308, and subtract the 1st, 43.79979, which gives 0.37329.
So I want to have NA NA 0.37329 instead of placing 0.37329 in the first position of the new lag2 vector.
Upvotes: 3
Views: 1525
Reputation: 2684
For those looking for a tidyverse solution, one option is to use dplyr::lag, which I find more intuitive than the approach with base::apply.
vec - dplyr::lag(vec, n = 2)
So the idea is basically to generate a second vector with the positions lagged by n, and then simply subtract the two vectors, making the most of vectorized functions in R.
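The same "shift, then subtract" idea can be sketched in base R without loading dplyr. The helper name shift_right below is hypothetical (it is not part of this answer or of any package); it is only meant to mimic what dplyr::lag(x, n) does for a plain vector, under the assumption that n is a non-negative integer no larger than the vector length.

```r
# Hypothetical helper (an assumption for illustration, not from any package):
# shift x right by n positions and pad the front with NA.
shift_right <- function(x, n) {
  stopifnot(n >= 0, n <= length(x))
  c(rep(NA_real_, n), head(x, length(x) - n))
}

vec <- c(43.79979, 44.04865, 44.17308, 44.54638, 44.79524, 44.79524,
         44.79524, 44.42195, 44.54638, 44.79524, 44.42195, 43.30206,
         43.30206, 43.17764, 43.30206)

# Lag-2 difference, same length as vec, with two leading NAs
lag2 <- vec - shift_right(vec, 2)

# Lags 1 to 3 at once: one column per lag, one row per element of vec
lags <- sapply(1:3, function(n) vec - shift_right(vec, n))
colnames(lags) <- paste0("lag", 1:3)
```

Because every column keeps the length of vec, the results can be bound back onto a data frame alongside the original values.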
Upvotes: 1
Reputation: 3905
Just like in the question "Zoo lag diff back in data frame":
vec = c(43.79979, 44.04865, 44.17308, 44.54638, 44.79524, 44.79524, 44.79524, 44.42195, 44.54638, 44.79524, 44.42195, 43.30206, 43.30206, 43.17764, 43.30206)
require(zoo)
apply(lag(zoo(vec), c(-2,0), na.pad = TRUE), 1L, diff)
#> apply(lag(zoo(vec), c(-2,0), na.pad = TRUE), 1L, diff)
# 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
# NA NA 0.37329 0.49773 0.62216 0.24886 0.00000 -0.37329 -0.24886 0.37329 -0.12443 -1.49318 -1.11989 -0.12442 0.00000
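A possibly simpler variant, offered as a sketch rather than as part of the original answer: na.pad is an argument of zoo's diff method, so it only takes effect when the input is a zoo object. On a plain numeric vector, base diff appears to ignore it silently, which would explain the shortened vectors in the question. Converting to zoo first and stripping the index afterwards should give the padded result directly.

```r
library(zoo)

vec <- c(43.79979, 44.04865, 44.17308, 44.54638, 44.79524, 44.79524,
         44.79524, 44.42195, 44.54638, 44.79524, 44.42195, 43.30206,
         43.30206, 43.17764, 43.30206)

# diff() dispatches to zoo's method here; na.pad = TRUE keeps the original
# index and fills the first `lag` positions with NA.
lag2 <- as.numeric(coredata(diff(zoo(vec), lag = 2, na.pad = TRUE)))
```

This avoids lag entirely, so it is not affected by which package's lag is currently on the search path.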
On May 10th 2018 it was pointed out to me by @thistleknot (thanks!) that dplyr masks stats's own lag generic. Therefore make sure you don't have dplyr attached, or run stats::lag explicitly; otherwise my code won't run.
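As a minimal sketch of the explicit form, the call above can be written with the namespace prefix so that it should behave the same whether or not dplyr is attached. The intermediate name lagged is only introduced here for readability.

```r
library(zoo)

vec <- c(43.79979, 44.04865, 44.17308, 44.54638, 44.79524, 44.79524,
         44.79524, 44.42195, 44.54638, 44.79524, 44.42195, 43.30206,
         43.30206, 43.17764, 43.30206)

# stats::lag is the generic that dispatches to zoo's lag method;
# the prefix sidesteps dplyr::lag if dplyr happens to be attached.
lagged <- stats::lag(zoo(vec), c(-2, 0), na.pad = TRUE)

# Row-wise difference between the unlagged and the lag-2 column
lag2 <- apply(lagged, 1L, diff)
```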
I think I found the culprit: github.com/tidyverse/dplyr/issues/1586. The answer there: "This is a natural consequence of having lots of R packages. Just be explicit and use stats::lag or dplyr::lag."
Upvotes: 3