Reputation: 11
I'm running a dynlm regression in R.
I have a question about using a loop to determine the optimal time lag.
I would like to know which time lag is optimal for a certain variable, ceteris paribus.
Is there a way to plot this, or how should I proceed?
dynlm(y ~ a + b + c + L(d,y))
I would like to know which time lag is the most significant.
Upvotes: 1
Views: 4425
Reputation: 170
I would use auto.arima() from the forecast package, which is much better suited than dynlm for this purpose. auto.arima() selects the best model according to an information criterion (AICc, AIC or BIC). It can search either stepwise or exhaustively. If the data are not seasonal, just set seasonal = FALSE. In your case, to try all models up to lag 10, you could use:
library(forecast)
# Maximum lag allowed
maxlag <- 10
# Convert the formula to a regressor matrix (no intercept column)
mm <- model.matrix(~ a + b + c - 1, data = mydata)
y <- as.ts(mydata$y)
fit <- auto.arima(y, xreg = mm, d = 0, max.p = maxlag, max.order = maxlag, max.q = 0,
                  seasonal = FALSE, trace = TRUE, stepwise = FALSE)
print(fit)
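To make the idea behind the exhaustive search concrete, here is a rough base-R sketch that loops over AR orders for a regression with AR(p) errors and compares AIC. It is not what auto.arima() does internally (which uses AICc by default and its own search logic). The data are simulated stand-ins, the variable names a, b, c and y simply mirror the question, and the cap of 5 lags is an arbitrary choice to keep the example fast.

```r
set.seed(1)
n <- 200
# Simulated stand-ins for the asker's data (hypothetical values)
mydata <- data.frame(a = rnorm(n), b = rnorm(n), c = rnorm(n))
mydata$y <- 1 + 0.5 * mydata$a - 0.3 * mydata$b + 0.2 * mydata$c +
  as.numeric(arima.sim(list(ar = c(0.5, 0.2)), n = n))

mm <- model.matrix(~ a + b + c - 1, data = mydata)
y <- as.ts(mydata$y)

maxlag <- 5
# Fit a regression with AR(p) errors for each p and record the AIC
aic <- sapply(0:maxlag, function(p) {
  AIC(arima(y, order = c(p, 0, 0), xreg = mm, method = "ML"))
})
names(aic) <- 0:maxlag

# Order with the smallest AIC
best_p <- as.integer(names(aic)[which.min(aic)])
print(aic)
print(best_p)
```

Using method = "ML" for every fit keeps the likelihoods, and therefore the AIC values, comparable across orders.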
Upvotes: 1
Reputation: 17193
Personally, I would probably manually set up a data.frame with the lagged variables and then use whichever variable selection method you feel comfortable with, e.g., stepwise selection via step(), all-subsets regression (for example via leaps), or lasso et al. via glmnet, etc.
As a simple example based on the Nile time series:
d <- as.data.frame(ts.intersect(
y = Nile,
y1 = lag(Nile, -1),
y2 = lag(Nile, -2),
y3 = lag(Nile, -3),
y4 = lag(Nile, -4),
y5 = lag(Nile, -5)
))
m <- lm(y ~ y1 + y2 + y3 + y4 + y5, data = d)
m2 <- step(m)
coef(m2)
## (Intercept) y1 y2
## 381.8464340 0.3923535 0.1835561
This stepwise AIC-based selection could also be done via dynlm, but you need to be careful that all models are fitted to the same subset of the data. (In the approach above, ts.intersect makes sure that the right subset without NAs is used.)
library(dynlm)
dm <- dynlm(Nile ~ L(Nile, 1) + L(Nile, 2) + L(Nile, 3) + L(Nile, 4) + L(Nile, 5),
            start = 1876)
dm2 <- step(dm)
coef(dm2)
## (Intercept) L(Nile, 1) L(Nile, 2)
## 381.8464340 0.3923535 0.1835561
The advantage of the former approach is that it can be used with anything you would use for non-dynamic linear regressions as well. The latter also works with step() but maybe not with other methods. (Note that I'm not particularly endorsing AIC-based stepwise selection...)
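Since the question asks about a loop, here is a minimal base-R sketch of one, again on the Nile series. It fits models with lags 1..p for each p on the same observations and tabulates AIC and BIC. The maximum of 5 lags and the names max_lag, res and the y1..y5 columns are only illustrative choices, and comparing nested lag orders this way is just one of several reasonable selection strategies.

```r
# Lags 1..5 of Nile, aligned so that every model uses the same observations
max_lag <- 5
X <- embed(as.numeric(Nile), max_lag + 1)  # column 1 = y_t, column k + 1 = y_{t-k}
d <- as.data.frame(X)
names(d) <- c("y", paste0("y", 1:max_lag))

# Fit a regression on lags 1..p for each p and collect the criteria
res <- do.call(rbind, lapply(1:max_lag, function(p) {
  f <- reformulate(paste0("y", 1:p), response = "y")
  m <- lm(f, data = d)
  data.frame(lag = p, AIC = AIC(m), BIC = BIC(m))
}))

print(res)
# Lag order with the smallest AIC
print(res$lag[which.min(res$AIC)])
```

Calling plot(res$lag, res$AIC, type = "b") afterwards gives a quick picture of how the criterion changes with the lag order.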
Upvotes: 2