Daniel

Reputation: 572

Lag values for supervised learning in sequence / time-series

I am trying to figure out how to solve this problem in R. I want to apply different machine learning regression models to time series data, which means framing the task as supervised learning. For that I need a function or package that lets me step n observations forward and n observations back, like a sliding window. The table below shows the input (t-n) and output (t+n) variables, with the current observation (t) considered an output.

   var1(t-1)  var2(t-1)  var1(t)  var2(t)  var1(t+1)  var2(t+1)
1          4         69        5       70          6         71
2          5         70        6       71          7         72
3          6         71        7       72          8         73
4          7         72        8       73          9         74
5          8         73        9       74         10         75
6          9         74       10       75         11         76
7         10         75       11       76         12         77
8         11         76       12       77         13         78

I have already looked at some useful functions such as lag() and the shift() function from r-blogger.com, but the problem with these examples is that they generate missing values at the edges of the series.

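    shift <- function(x, shift_by) {
        stopifnot(is.numeric(shift_by))
        stopifnot(is.numeric(x))

        # Several shifts at once: return one column per element of shift_by
        if (length(shift_by) > 1)
            return(sapply(shift_by, shift, x = x))

        out <- NULL
        abs_shift_by <- abs(shift_by)
        if (shift_by > 0)
            # Lead: drop the first values and pad the end with NA
            out <- c(tail(x, -abs_shift_by), rep(NA, abs_shift_by))
        else if (shift_by < 0)
            # Lag: pad the start with NA and drop the last values
            out <- c(rep(NA, abs_shift_by), head(x, -abs_shift_by))
        else
            out <- x
        out
    }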

Result of the shift() function:

     x  df_lead2  df_lag2
1    5         4       NA
2    6         5       NA
3    7         6        5
4    8         7        6
5    9         8        7
6   10         9        8
7   11        10        9
8   12        11       10
9   13        NA       11
10  14        NA       12

So is there a package or built-in function that takes a data frame and computes, for each variable, the t-n and t+n values for a given number of steps?

It would be great if someone could help me. Thanks!

Upvotes: 1

Views: 587

Answers (1)

timfaber

Reputation: 2070

You might be able to use rollapply (zoo):

    library(zoo)
    rollapply(iris$Sepal.Length, width = 3, by = 2, FUN = mean, align = "left")

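You can specify whether or not a value is computed when a window has no subsequent values left, for example with the partial argument of zoo's rollapply. The rowr package has a similar rollApply function (https://rdrr.io/cran/rowr/man/rollApply.html).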
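
If the goal is the wide table from the question rather than a rolling summary, it can probably be built in base R by keeping only the rows that have a complete window, which avoids the NA padding altogether. The following is a minimal sketch, assuming the rows are already in time order and equally spaced. The names make_supervised, n_lag, and n_lead are hypothetical and chosen for illustration; they do not come from zoo or any other package.

```r
# Hypothetical helper (not from any package): for every column of df, build
# copies shifted by -n_lag, ..., 0, ..., +n_lead steps and bind them side by side.
# Only rows with a complete window are kept, so no NA values are produced.
make_supervised <- function(df, n_lag = 1, n_lead = 1) {
  stopifnot(is.data.frame(df), n_lag >= 0, n_lead >= 0)
  n <- nrow(df)
  stopifnot(n > n_lag + n_lead)

  offsets <- seq(-n_lag, n_lead)   # e.g. -1, 0, 1
  n_out <- n - n_lag - n_lead      # number of rows with a full window

  pieces <- lapply(offsets, function(k) {
    # Rows of df that line up with offset k for each output row
    rows <- seq_len(n_out) + n_lag + k
    piece <- df[rows, , drop = FALSE]
    suffix <- if (k == 0) "(t)" else sprintf("(t%+d)", k)
    names(piece) <- paste0(names(df), suffix)
    rownames(piece) <- NULL
    piece
  })

  # cbind on data frames keeps the column names as they are
  do.call(cbind, pieces)
}

# Example data matching the table in the question
df <- data.frame(var1 = 4:13, var2 = 69:78)
out <- make_supervised(df, n_lag = 1, n_lead = 1)
print(out)
```

With n_lag = 1 and n_lead = 1 the ten input rows give eight output rows, because the first and last observation have no complete window. Dropping those rows is a design choice: it loses n_lag + n_lead observations but leaves a table most regression functions accept without NA handling.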

Upvotes: 1
