Reputation: 78
I have a large panel of firms stock returns and value-weighted S&P500 return. I want to apply a rolling window regression, where I regress the firms returns of the previous twelve months on the value weighted S&P500 return, and extract the standard deviation of the squared residuals.
My code looks as follows
stdev <- matrix(NA,nrow = nrow(ReturnMatrix),ncol = 1)
pb <- winProgressBar(title = "",label = "",min = 1,max =
nrow(ReturnMatrix)-11)
for(i in 1:(nrow(ReturnMatrix)-11))
{
VWRet <- ReturnMatrix$VWReturn[i:(i+11)]
Ret <- ReturnMatrix$Return[i:(i+11)]
if(sum(is.na(Ret)) >= 6)
{
stdev[i+11] <- NA
}
else{
Model <- glm(Ret~VWRet-1)
stdev[i+11] <- sigma(Model)
}
setWinProgressBar(pb,value = i,title = paste0(round(100*
(i/(nrow(ReturnMatrix) - 11)),2)," % Done"))
}
SD <- cbind.data.frame(ReturnMatrix,stdev)
The dataframe ReturnMatrix is very large, it contains 3239065 rows. The variables in the dataframe are PERMNO which is a firm identifier, YearMonth which is the date in YYYYMM format, Return which is the firms return of that month and VWReturn which is the value weighted S&P500 return.
Right now, it takes about 1 hour to run this for loop.
My question is: Is there any way to speed this process up a notch, I have tried using rollapply on zoo(ReturnMatrix), but this only slows it down even more.
Any help would be greatly appreciated.
Upvotes: 1
Views: 245
Reputation: 16277
Here's how to do that with data.table
, which should be the fastest way to do what you want. You first need to build a sigma function and then use rollaplyr
with .SD
.
set.seed(1)
library(data.table)
dt <- data.table(PERMNO=rep(LETTERS[1:3],each=13),
YearMonth=seq.Date(from=Sys.Date(),by="month",length.out =13),
Return=runif(39),VWReturn=runif(39))
#create sigma function
stdev <- function(x) sd(lm(x[, 1]~ x[, 2])$residuals)
#create new column with rollapply
dt[,roll_sd:=rollapplyr(.SD, 12, stdev, by.column = FALSE, fill = NA),
by=.(PERMNO),.SDcols = c("Return", "VWReturn")]
PERMNO YearMonth Return VWReturn roll_sd
1: A 2017-11-19 0.26550866 0.41127443 NA
2: A 2017-12-19 0.37212390 0.82094629 NA
3: A 2018-01-19 0.57285336 0.64706019 NA
4: A 2018-02-19 0.90820779 0.78293276 NA
5: A 2018-03-19 0.20168193 0.55303631 NA
6: A 2018-04-19 0.89838968 0.52971958 NA
7: A 2018-05-19 0.94467527 0.78935623 NA
8: A 2018-06-19 0.66079779 0.02333120 NA
9: A 2018-07-19 0.62911404 0.47723007 NA
10: A 2018-08-19 0.06178627 0.73231374 NA
11: A 2018-09-19 0.20597457 0.69273156 NA
12: A 2018-10-19 0.17655675 0.47761962 0.3181427
13: A 2018-11-19 0.68702285 0.86120948 0.3141638
14: B 2017-11-19 0.38410372 0.43809711 NA
....
Upvotes: 1