Reputation: 204
I have a data frame (data) that I want to pass as an argument to a function, fun.lag_cols, to calculate several lags for each column. The results must be stored in a nested list, but my function seems to be missing at least one step.
data <- data.frame(x1 = rnorm(10,0,1)
, x2 = rnorm(10,2,3)
, x3 = rnorm(10,6,1))
fun.lag_cols <- function(x, lag_from = 0, lag_to = 2) {
x <- as.data.frame(x)
cols_x <- ncol(x)
lst_lag <- list()
for (i in 1:cols_x) {
for(j in lag_from:lag_to) {
lst_lag[[i]] <- dplyr::lag(x[,i],j)
}
}
return(lst_lag)
}
output <- fun.lag_cols(data)
In this particular example, I would like output to be a list of 3 elements (x1, x2, x3), where each element is itself a list of 3 (one per lag: 0, 1, 2).
My code seems to store only lag2 (in general, the maximum lag) for each variable, clearly not the expected result.
I am open to different approaches, as long as they provide the final output (nested list).
Thanks
Upvotes: 1
Views: 117
Reputation: 388972
Using lapply:
fun.lag_cols <- function(x, lag_from = 0, lag_to = 2) {
val <- lag_from:lag_to
lapply(x, function(v)
setNames(lapply(val, function(n) dplyr::lag(v, n)), paste0('lag_', val)))
}
fun.lag_cols(data)
#$x1
#$x1$lag_0
# [1] -1.5095832 -0.2638919 0.5986575 3.3043298 0.9471048 -1.2154015
# [7] 0.8921754 -1.6614204 -0.2036500 0.9570701
#$x1$lag_1
# [1] NA -1.5095832 -0.2638919 0.5986575 3.3043298 0.9471048
# [7] -1.2154015 0.8921754 -1.6614204 -0.2036500
#$x1$lag_2
# [1] NA NA -1.5095832 -0.2638919 0.5986575 3.3043298
# [7] 0.9471048 -1.2154015 0.8921754 -1.6614204
#$x2
#$x2$lag_0
# [1] -4.8181366 4.1741754 4.6560021 -0.5167334 1.5284542 8.7717049
# [7] -0.2104695 2.4273092 1.4985899 2.7356401
#$x2$lag_1
# [1] NA -4.8181366 4.1741754 4.6560021 -0.5167334 1.5284542
# [7] 8.7717049 -0.2104695 2.4273092 1.4985899
#$x2$lag_2
# [1] NA NA -4.8181366 4.1741754 4.6560021 -0.5167334
# [7] 1.5284542 8.7717049 -0.2104695 2.4273092
#$x3
#$x3$lag_0
# [1] 7.712619 5.237124 5.798063 5.695696 5.127347 3.789074 5.830557
# [8] 3.801073 5.794048 5.227110
#$x3$lag_1
# [1] NA 7.712619 5.237124 5.798063 5.695696 5.127347 3.789074
# [8] 5.830557 3.801073 5.794048
#$x3$lag_2
# [1] NA NA 7.712619 5.237124 5.798063 5.695696 5.127347
# [8] 3.789074 5.830557 3.801073
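For readers who want the same nested, named structure without depending on dplyr, here is a minimal base R sketch. The helper lag_vec and the wrapper lag_cols_base are hypothetical names introduced only for illustration, and the helper assumes non-negative lags; it is a stand-in for dplyr::lag, not a full replacement.

```r
# Hypothetical helper: base R stand-in for dplyr::lag() on a plain vector.
# Assumes n is a single non-negative integer.
lag_vec <- function(v, n) {
  if (n == 0) return(v)
  len <- length(v)
  # Pad the front with NA and drop the last n values
  c(rep(NA, min(n, len)), head(v, max(len - n, 0)))
}

# Same shape as the lapply answer: one named sub-list of lags per column
lag_cols_base <- function(x, lag_from = 0, lag_to = 2) {
  lags <- lag_from:lag_to
  lapply(as.data.frame(x), function(v) {
    setNames(lapply(lags, function(n) lag_vec(v, n)), paste0("lag_", lags))
  })
}

demo <- data.frame(a = 1:5, b = c(10, 20, 30, 40, 50))
out <- lag_cols_base(demo)

names(out)     # "a" "b"
names(out$a)   # "lag_0" "lag_1" "lag_2"
out$a$lag_2    # NA NA 1 2 3
```

Because both levels are named, individual lags can be pulled out with out$a$lag_1 or out[["b"]][["lag_2"]].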
Upvotes: 1
Reputation: 887088
We could change the assignment to lst_lag[[i]] so that each new lag is concatenated onto the element inside the nested loop instead of replacing it. The function needs two changes: 1) initialize the output list with a predefined length (vector('list', ncol(x))), and 2) inside the nested loop, append to the i-th list element by concatenating the existing list with a new list created by wrapping the lag in list(), assigning the result back to the same element (<-) on every iteration.
fun.lag_cols <- function(x, lag_from = 0, lag_to = 2) {
x <- as.data.frame(x)
cols_x <- ncol(x)
lst_lag <- vector('list', ncol(x))
for (i in 1:cols_x) {
for(j in lag_from:lag_to) {
lst_lag[[i]] <- c(lst_lag[[i]], list(dplyr::lag(x[,i],j)))
}
}
return(lst_lag)
}
-testing
fun.lag_cols(data)
[[1]]
[[1]][[1]]
[1] -1.40431393 -2.22551238 0.06090537 0.77941726 1.10733091 1.20657717 0.71614034 -0.17990135 0.22058894 0.33598415
[[1]][[2]]
[1] NA -1.40431393 -2.22551238 0.06090537 0.77941726 1.10733091 1.20657717 0.71614034 -0.17990135 0.22058894
[[1]][[3]]
[1] NA NA -1.40431393 -2.22551238 0.06090537 0.77941726 1.10733091 1.20657717 0.71614034 -0.17990135
[[2]]
[[2]][[1]]
[1] 1.1334651 1.2385579 1.8930347 -4.7379766 2.0169352 0.7210822 -1.0322536 4.5446643 1.4421923 1.1316508
[[2]][[2]]
[1] NA 1.1334651 1.2385579 1.8930347 -4.7379766 2.0169352 0.7210822 -1.0322536 4.5446643 1.4421923
[[2]][[3]]
[1] NA NA 1.1334651 1.2385579 1.8930347 -4.7379766 2.0169352 0.7210822 -1.0322536 4.5446643
[[3]]
[[3]][[1]]
[1] 4.324912 5.114774 4.517017 7.001338 5.218430 4.408571 7.233504 6.875883 5.848294 4.696724
[[3]][[2]]
[1] NA 4.324912 5.114774 4.517017 7.001338 5.218430 4.408571 7.233504 6.875883 5.848294
[[3]][[3]]
[1] NA NA 4.324912 5.114774 4.517017 7.001338 5.218430 4.408571 7.233504 6.875883
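To see why the original loop kept only the largest lag, the difference between overwriting and appending can be isolated in a few lines of base R. The variable names below (overwritten, appended) are made up for this illustration, and plain numbers stand in for the lagged vectors.

```r
# Overwriting: every pass replaces element 1, so only the last value survives
overwritten <- list()
for (j in 0:2) {
  overwritten[[1]] <- j
}
overwritten[[1]]          # 2

# Appending: c() grows the sub-list stored in element 1 on every pass
appended <- vector("list", 1)
for (j in 0:2) {
  appended[[1]] <- c(appended[[1]], list(j))
}
length(appended[[1]])     # 3
unlist(appended[[1]])     # 0 1 2
```

Wrapping the new value in list() matters: without it, c() would flatten the lagged vectors into one long vector rather than keeping one list element per lag.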
There is already a function for this kind of lagging: shift from data.table, which accepts a vector of values for n.
library(data.table)
shift(data, n = 0:2)
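One caveat, as far as I understand shift: when it is given a whole data frame and several values of n, it returns a single flat list with one element per column/lag combination, not a list of lists. A sketch of how the nested shape asked for in the question could be recovered is to apply shift column by column; the demo data and the lag_ naming scheme below are assumptions for illustration.

```r
library(data.table)

demo <- data.frame(a = 1:5, b = 6:10)
lags <- 0:2

# Flat result: 2 columns x 3 lags gives 6 top-level elements
flat <- shift(demo, n = lags)
length(flat)       # 6

# Nested result: one sub-list of lags per column
nested <- lapply(demo, function(v) {
  setNames(shift(v, n = lags), paste0("lag_", lags))
})
length(nested)     # 2
nested$a$lag_1     # NA 1 2 3 4
```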
Upvotes: 1