user555265

Reputation: 493

Rolling multi regression in R data table

Say I have an R data.table DT which has a list of returns:

Date          Return
2016-01-01    -0.01
2016-01-02    0.022
2016-01-03    0.1111
2016-01-04    -0.006
...

I want to run a rolling multiple regression in which the previous N observations of Return predict the next Return, estimated over a window of K rows. For example, over the last K = 120 days, regress each observation on its N = 14 preceding observations. Once I have this regression, I want to use the predict function to get a prediction for each row. In pseudocode it would be something like:

 DT[, Prediction := predict(lm(Return[prev K - N -1] ~ Return[N observations prev for each observation]), Return[N observations previous for this observation])]

To be clear, I want a multiple regression, so if N were 3 it would be:

lm(Return ~ Return[-1] + Return[-2] + Return[-3])  ## where the negatives are the prev rows

How do I write this as efficiently as possible?

Thanks

Upvotes: 0

Views: 830

Answers (2)

Hack-R

Reputation: 23214

If I understand correctly you want a quarterly auto-regression.

There's a related thread on time-series with data.table here.

You can set up a rolling date in data.table like this (see the link above for more context):

#Example for quarterly data
quarterly[, rollDate:=leftBound]
storeData[, rollDate:=date]

setkey(quarterly,"rollDate")
setkey(storeData,"rollDate")

Since you only provided a few rows of example data, I extended the series through 2019 and made up random return values.
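A made-up series of this kind can also be generated without the clipboard. Here is a minimal sketch, assuming normally distributed daily returns. The name `DT_sim`, the seed, the end date, and the 0.02 standard deviation are all illustrative assumptions, not the values behind the output shown further down.

```r
set.seed(42)  # arbitrary seed, only so the made-up values are reproducible

# Hypothetical stand-in for the extended series: one row per day through 2019
dates <- seq(as.Date("2016-01-01"), as.Date("2019-09-30"), by = "day")

DT_sim <- data.frame(
  Date   = dates,
  Return = rnorm(length(dates), mean = 0, sd = 0.02)  # assumed scale of daily returns
)

head(DT_sim)
```

If you want a data.table, `data.table::as.data.table(DT_sim)` should convert it.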

First get your data set up:

library(forecast)
library(xts)
DT <- read.table(file("clipboard"))   # two columns: V1 = date, V2 = return
dput(DT) # the dput was too long to display here
DT[,1] <- as.POSIXct(strptime(DT[,1], "%m/%d/%Y"))
DT[,2] <- as.double(DT[,2])
dat <- xts(DT$V2, order.by = DT$V1)   # pass the index once, via order.by

x.ts          <- to.quarterly(dat) # quarterly bars, standing in for the 120-day window

        dat.Open dat.High dat.Low dat.Close
2016 Q1     1292     1292       1       698
2016 Q2      138     1290       3       239
2016 Q3      451     1285       5       780
2016 Q4      355     1243      27      1193
2017 Q1      878     1279       4       687
2017 Q2      794     1283      12       411
2017 Q3      858     1256       9      1222
2017 Q4      219     1282      15       117
2018 Q1      554     1286      32       432
2018 Q2      630     1272      30        46
2018 Q3      310     1288      18       979
2019 Q1      143     1291      10       184
2019 Q2      250     1289       8       441
2019 Q3      110     1220      23       571

Then you can do a rolling ARIMA model with or without re-estimation like this:

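# Assumes x.ts is a univariate quarterly ts (e.g. built from one column such as dat.Close)
h     <- 1                              # forecast horizon, in quarters
train <- window(x.ts, end = 2017.99)    # initial estimation sample
test  <- window(x.ts, start = 2018)     # periods to forecast
n     <- length(test) - h + 1           # number of rolling forecasts
fit   <- auto.arima(train)
order <- arimaorder(fit)
fcmat <- matrix(0, nrow = n, ncol = h)
for (i in 1:n) {
  # Extend the estimation sample by one quarter per iteration
  x <- window(x.ts, end = 2017.99 + (i - 1) / 4)
  # Re-estimate the coefficients, keeping the selected order
  # (order[4:6] are the seasonal terms, present only if a seasonal model was selected)
  refit <- Arima(x, order = order[1:3], seasonal = order[4:6])
  fcmat[i, ] <- forecast(refit, h = h)$mean
}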

Here's a good related resource with several examples of different ways you might construct this: http://robjhyndman.com/hyndsight/rolling-forecasts/

Upvotes: 4

bartleby

Reputation: 107

You have to have the lags in columns anyway, so if I understand you correctly, you can do something like this, say for a lag of 3:

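setkey(DT, Date)
lag_max <- 3
for (i in 1:lag_max) {
  # lag<i> holds the Return from i rows earlier
  set(DT, NULL, paste0("lag", i), shift(DT[["Return"]], i, type = "lag"))
}
# na.exclude pads the fitted values with NA for the first lag_max rows,
# so there is one value per row of DT
DT[, prediction := fitted(lm(Return ~ lag1 + lag2 + lag3, na.action = na.exclude))]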
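The lm call above fits a single regression on the full sample. If the regression should instead be re-estimated for each row using only the previous K rows, one possible sketch in base R is below. `rolling_ar_predict` is a hypothetical helper, not a data.table or stats function. Reading K as "the K observations immediately before the row being predicted" is my assumption about the question.

```r
# Hypothetical helper: for each row t, regress the series on its N lags using
# only the K observations before t, then predict row t from its own N lags.
rolling_ar_predict <- function(x, N, K) {
  # Need more regression rows (K - N) than coefficients (N + 1)
  stopifnot(K > 2 * N + 1, length(x) > K)
  pred <- rep(NA_real_, length(x))       # rows without a full window stay NA
  for (t in (K + 1):length(x)) {
    w <- x[(t - K):(t - 1)]              # the K observations before row t
    m <- embed(w, N + 1)                 # col 1 = target, cols 2..N+1 = lags 1..N
    X <- cbind(1, m[, -1, drop = FALSE]) # intercept plus the N lag columns
    beta <- lm.fit(X, m[, 1])$coefficients
    newx <- c(1, x[(t - 1):(t - N)])     # intercept plus lags 1..N of row t
    pred[t] <- sum(beta * newx)
  }
  pred
}

# Example on made-up returns (values are illustrative only)
set.seed(1)
ret  <- rnorm(300, sd = 0.02)
pred <- rolling_ar_predict(ret, N = 14, K = 120)
```

With a data.table this could be applied as `DT[, Prediction := rolling_ar_predict(Return, N = 14, K = 120)]`. It refits one regression per row, so it is not the fastest possible approach, but `lm.fit` avoids most of the formula overhead of `lm`.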

Upvotes: 0
