Marco Grazioli

Reputation: 11

R: dynlm - how to determine the most significant time lag of a lagged variable

I'm running a dynlm regression in R.

I have a question about using a loop to determine the optimal time lag.

I would like to know which time lag is optimal for a certain variable, ceteris paribus.

Is there a way to plot this, or how else should I proceed?

dynlm(y ~ a + b + c + L(d, k))  # k: the lag of d to be determined

In other words, which time lag is the most significant?

Upvotes: 1

Views: 4425

Answers (2)

Acoustesh

Reputation: 170

I would use auto.arima() from the forecast package, which is much better suited than dynlm for this purpose. auto.arima() selects the best model based on an information criterion (AICc, AIC, or BIC). It can search either stepwise or exhaustively. If the data are not seasonal, just set seasonal = FALSE. In your case, to try all models up to lag 10, you could use:

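        library(forecast)
        # maximum lag allowed
        maxlag <- 10
        # convert the regressors to a numeric matrix (no intercept column)
        mm <- model.matrix(~ a + b + c - 1, data = mydata)
        y <- as.ts(mydata$y)
        # exhaustive (non-stepwise) search over AR orders up to maxlag, no MA terms
        fit <- auto.arima(y, xreg = mm, d = 0, max.p = maxlag, max.order = maxlag, max.q = 0,
                seasonal = FALSE, trace = TRUE, stepwise = FALSE)
        print(fit)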

Upvotes: 1

Achim Zeileis

Reputation: 17193

Personally, I would probably set up a data.frame with the lagged variables manually and then use whichever variable selection method you are comfortable with, e.g., stepwise selection via step(), all-subsets regression via leaps, or lasso and related methods via glmnet.

As a simple example based on the Nile time series:

d <- as.data.frame(ts.intersect(
  y = Nile,
  y1 = lag(Nile, -1),
  y2 = lag(Nile, -2),
  y3 = lag(Nile, -3),
  y4 = lag(Nile, -4),
  y5 = lag(Nile, -5)
))
m <- lm(y ~ y1 + y2 + y3 + y4 + y5, data = d)
m2 <- step(m)
coef(m2)
## (Intercept)          y1          y2 
## 381.8464340   0.3923535   0.1835561 
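As a variation on the step() call above, all-subsets selection can be sketched in base R alone. This is only an illustration under assumptions: it enumerates the subsets by brute force rather than using leaps, it ranks them by BIC (an arbitrary choice here), and the names `lags`, `subsets`, `bic`, and `best` are made up for this example.

```r
# Rebuild the lagged data frame so that this block runs on its own
d <- as.data.frame(ts.intersect(
  y  = Nile,
  y1 = stats::lag(Nile, -1),
  y2 = stats::lag(Nile, -2),
  y3 = stats::lag(Nile, -3),
  y4 = stats::lag(Nile, -4),
  y5 = stats::lag(Nile, -5)
))

lags <- paste0("y", 1:5)

# Every non-empty subset of the five lag columns (2^5 - 1 = 31 candidates)
subsets <- unlist(
  lapply(seq_along(lags), function(m) combn(lags, m, simplify = FALSE)),
  recursive = FALSE
)

# Fit each candidate on the same rows of d and record its BIC
bic <- sapply(subsets, function(s) {
  BIC(lm(reformulate(s, response = "y"), data = d))
})

# Subset of lags with the smallest BIC
best <- subsets[[which.min(bic)]]
best
```

Brute force is only feasible for a handful of candidate lags, since the number of subsets doubles with every additional lag.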

This stepwise AIC-based selection could also be done via dynlm, but you need to be careful that all models are fitted to the same subset of the data. (In the approach above, ts.intersect makes sure that the right subset without NAs is used.)

library(dynlm)
dm <- dynlm(Nile ~ L(Nile, 1) + L(Nile, 2) + L(Nile, 3) + L(Nile, 4) + L(Nile, 5),
  start = 1876)
dm2 <- step(dm)
coef(dm2)
## (Intercept)  L(Nile, 1)  L(Nile, 2) 
## 381.8464340   0.3923535   0.1835561 

The advantage of the former approach is that it can be used with anything that you would use for non-dynamic linear regressions as well. The latter also works with step() but maybe not with other methods. (Note that I'm not particularly endorsing AIC-based stepwise selection...)
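If the goal is to compare one lag at a time, as the question asks, a loop over candidate lags on the same data frame might look like the sketch below. It is a minimal illustration under assumptions: it uses lags of the Nile series itself rather than the question's regressor d, and the names `res`, `term`, and `fit` are invented for this example. For the original model, the lags of d would go into the data frame and a, b, and c would stay in every formula.

```r
# Common sample for all candidate lags (1876 onward), so the AICs are comparable
d <- as.data.frame(ts.intersect(
  y  = Nile,
  y1 = stats::lag(Nile, -1),
  y2 = stats::lag(Nile, -2),
  y3 = stats::lag(Nile, -3),
  y4 = stats::lag(Nile, -4),
  y5 = stats::lag(Nile, -5)
))

# Fit one model per lag and collect AIC and the p-value of the lag coefficient
res <- do.call(rbind, lapply(1:5, function(k) {
  term <- paste0("y", k)
  fit <- lm(reformulate(term, response = "y"), data = d)
  data.frame(
    lag     = k,
    aic     = AIC(fit),
    p_value = summary(fit)$coefficients[term, "Pr(>|t|)"]
  )
}))

# Candidate lags ordered from lowest to highest AIC
res[order(res$aic), ]
```

A quick visual comparison is then possible with `plot(res$lag, res$aic, type = "b")`. Note that ranking lags by p-value alone is a rough heuristic, so an information criterion is shown alongside it.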

Upvotes: 2
