Krellex

Reputation: 753

Lagging time series data

I want to build a neural network model for prediction. I am trying to arrange my data in the format shown in the image, so that the model predicts from the three most recent values (2 days ago, 1 day ago, today). The window shifts forward by one day for each prediction: for example, the "1 day ago" value in the first input becomes the "2 days ago" value in the second, as seen in the image. I am using the lag function to lag the time series and a matrix to hold the data in this format, but I am confused and have been struggling with this for some time. How can I lag the data as shown in the image using the lag function?


My current code:

data <-
  structure(
    list(
      `USD/EUR` = c(
        1.373,
        1.386,
        1.3768,
        1.3718,
        1.3774,
        1.3672,
        1.3872,
        1.3932,
        1.3911,
        1.3838,
        1.4171,
        1.4164,
        1.3947,
        1.3675,
        1.3801,
        1.3744,
        1.3759,
        1.3743,
        1.3787,
        1.3595,
        1.3599,
        1.3624,
        1.3523,
        1.3506,
        1.3521
      )
    ),
    row.names = c(NA,-25L),
    class = c("tbl_df",
              "tbl", "data.frame")
  )

#Lag the data
lagData <- c(lag(data$`USD/EUR`,k = 1))
lagData

#store data into matrix to feed to neural net
matrixForm <- matrix(lagData, nrow = 25, ncol = 4, byrow = TRUE)
matrixForm

Upvotes: 1

Views: 482

Answers (1)

G. Grothendieck

Reputation: 269704

1) Use embed from base R; no additional packages are needed.

embed(data[[1]], 4)[, 4:1]

giving this matrix:

        [,1]   [,2]   [,3]   [,4]
 [1,] 1.3730 1.3860 1.3768 1.3718
 [2,] 1.3860 1.3768 1.3718 1.3774
 [3,] 1.3768 1.3718 1.3774 1.3672
 [4,] 1.3718 1.3774 1.3672 1.3872
 ...snip...
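
To connect this to the question's goal, here is a minimal sketch of splitting the embedded matrix into network inputs and a target. The short vector x stands in for data[[1]], and the column names (two_days_ago, one_day_ago, today, next_day) are assumptions made for illustration, not something the question's image confirms.

```r
# Hypothetical short series standing in for data[[1]]
x <- c(1.3730, 1.3860, 1.3768, 1.3718, 1.3774, 1.3672)

# embed() places the most recent value in the first column,
# so the columns are reversed to run from oldest to newest
lagged <- embed(x, 4)[, 4:1]

# Illustrative column names (assumed, not taken from the question)
colnames(lagged) <- c("two_days_ago", "one_day_ago", "today", "next_day")

inputs <- lagged[, 1:3]          # the three values the model sees
target <- lagged[, "next_day"]   # the value the model should predict
```

With n observations and a window of 4, the matrix has n - 3 rows, so no padding with NA is needed.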

2) Another possibility is flag (fast lag) in the collapse package:

library(collapse)

na_omit(flag(data[[1]], 3:0))

giving:

          L3     L2     L1     --
 [1,] 1.3730 1.3860 1.3768 1.3718
 [2,] 1.3860 1.3768 1.3718 1.3774
 [3,] 1.3768 1.3718 1.3774 1.3672
 [4,] 1.3718 1.3774 1.3672 1.3872
 ...snip...

3) zoo's lag also accepts a vector of lags, much like flag, except that it follows base R's sign convention, where a negative k refers to earlier values. Be sure that dplyr is not loaded, since dplyr's lag masks the lag generic that zoo relies on.

library(zoo)

na.omit(lag(zoo(data[[1]]), -3:0))

giving this zoo object:

    lag-3  lag-2  lag-1   lag0
4  1.3730 1.3860 1.3768 1.3718
5  1.3860 1.3768 1.3718 1.3774
6  1.3768 1.3718 1.3774 1.3672
7  1.3718 1.3774 1.3672 1.3872
...snip...
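
The sign convention can be checked without zoo. The following is a rough base R analogue using ts objects rather than zoo, with a hypothetical short vector x standing in for data[[1]]. Calling stats::lag by its full name also sidesteps masking by dplyr.

```r
# Hypothetical short series standing in for data[[1]]
x <- c(1.3730, 1.3860, 1.3768, 1.3718, 1.3774, 1.3672)
s <- ts(x)

# A negative k shifts the series forward in time, so each row pairs
# the current value with earlier ones; ts.intersect() keeps only the
# time points where all four series have a value
m <- ts.intersect(
  stats::lag(s, -3),
  stats::lag(s, -2),
  stats::lag(s, -1),
  s
)
m
```

The first column holds the oldest value in each window and the last column holds the most recent, matching the layout of the zoo result.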

Upvotes: 2
