Nordin
Nordin

Reputation: 78

How do I speed up rolling regressions?

I have a large panel of firms stock returns and value-weighted S&P500 return. I want to apply a rolling window regression, where I regress the firms returns of the previous twelve months on the value weighted S&P500 return, and extract the standard deviation of the squared residuals.

My code looks as follows

stdev <- matrix(NA,nrow = nrow(ReturnMatrix),ncol = 1)
pb <- winProgressBar(title = "",label = "",min = 1,max = 
nrow(ReturnMatrix)-11)
for(i in 1:(nrow(ReturnMatrix)-11))
{
 VWRet <- ReturnMatrix$VWReturn[i:(i+11)]
 Ret <- ReturnMatrix$Return[i:(i+11)]
 if(sum(is.na(Ret)) >= 6)
 {
   stdev[i+11] <- NA  
 }
else{
   Model <- glm(Ret~VWRet-1)
   stdev[i+11] <- sigma(Model)
   }
setWinProgressBar(pb,value = i,title = paste0(round(100*
(i/(nrow(ReturnMatrix) - 11)),2)," % Done"))
}
SD <- cbind.data.frame(ReturnMatrix,stdev)

The dataframe ReturnMatrix is very large, it contains 3239065 rows. The variables in the dataframe are PERMNO which is a firm identifier, YearMonth which is the date in YYYYMM format, Return which is the firms return of that month and VWReturn which is the value weighted S&P500 return.

Right now, it takes about 1 hour to run this for loop.

My question is: Is there any way to speed this process up a notch, I have tried using rollapply on zoo(ReturnMatrix), but this only slows it down even more.

Any help would be greatly appreciated.

Upvotes: 1

Views: 245

Answers (1)

Pierre Lapointe
Pierre Lapointe

Reputation: 16277

Here's how to do that with data.table, which should be the fastest way to do what you want. You first need to build a sigma function and then use rollaplyr with .SD.

set.seed(1)
library(data.table)
dt <- data.table(PERMNO=rep(LETTERS[1:3],each=13),
                 YearMonth=seq.Date(from=Sys.Date(),by="month",length.out =13),
                 Return=runif(39),VWReturn=runif(39))

#create sigma function
stdev <- function(x) sd(lm(x[, 1]~ x[, 2])$residuals)

#create new column with rollapply
dt[,roll_sd:=rollapplyr(.SD, 12, stdev, by.column = FALSE, fill = NA),
    by=.(PERMNO),.SDcols = c("Return", "VWReturn")]

    PERMNO  YearMonth     Return   VWReturn   roll_sd
 1:      A 2017-11-19 0.26550866 0.41127443        NA
 2:      A 2017-12-19 0.37212390 0.82094629        NA
 3:      A 2018-01-19 0.57285336 0.64706019        NA
 4:      A 2018-02-19 0.90820779 0.78293276        NA
 5:      A 2018-03-19 0.20168193 0.55303631        NA
 6:      A 2018-04-19 0.89838968 0.52971958        NA
 7:      A 2018-05-19 0.94467527 0.78935623        NA
 8:      A 2018-06-19 0.66079779 0.02333120        NA
 9:      A 2018-07-19 0.62911404 0.47723007        NA
10:      A 2018-08-19 0.06178627 0.73231374        NA
11:      A 2018-09-19 0.20597457 0.69273156        NA
12:      A 2018-10-19 0.17655675 0.47761962 0.3181427
13:      A 2018-11-19 0.68702285 0.86120948 0.3141638
14:      B 2017-11-19 0.38410372 0.43809711        NA
....

Upvotes: 1

Related Questions