Reputation: 35
I would like to include multiple lags of an exogenous variable in a regression. Let's say that I have the following data:
X = c(1, 4, 8, 9, 3, 5...)
X2 = c(4, 6, 7, 9, 7, 8...)
I want to use lags of X2 to predict X. Does anyone know which package allows me to do this? I have tried using dynlm and lag() from stats.
Thanks
Upvotes: 1
Views: 1893
Reputation: 4060
No external R library is required, I would say:
X2 = c(4, 6, 7, 9, 7, 8)
k = 2
# shift the series back by k positions, padding the start with NA
lagged_data <- function(x, k) c(rep(NA, k), head(x, -k))
lagged_data(X2, k)
# [1] NA NA  4  6  7  9
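To go from a shifted vector to an actual regression on several lags while staying in base R, one option is embed() from stats. The following is only a sketch: it uses the six values shown in the question, and the names max_lag, E, d, X2_lag1 and X2_lag2 are made up for illustration.

```r
# Data from the question, limited to the six values shown
X  <- c(1, 4, 8, 9, 3, 5)
X2 <- c(4, 6, 7, 9, 7, 8)

max_lag <- 2  # hypothetical name: number of lags of X2 to include

# embed() returns one row per usable time point t = max_lag + 1, ..., n,
# with columns X2[t], X2[t-1], ..., X2[t-max_lag]
E <- embed(X2, max_lag + 1)

d <- data.frame(
  y       = X[(max_lag + 1):length(X)],  # response aligned with the rows of E
  X2_lag1 = E[, 2],
  X2_lag2 = E[, 3]
)

# Ordinary least squares of X on the first two lags of X2
fit <- lm(y ~ X2_lag1 + X2_lag2, data = d)
coef(fit)
```

The first max_lag observations are dropped because their lags are not available, so only four rows remain for the fit here.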
Upvotes: 0
Reputation: 269870
This performs an ordinary linear regression of X on the first 2 lags of X2 with an intercept (fit2), on the first lag with an intercept (fit1) and just on an intercept (fit0). Note that in R one normally uses negative numbers to lag, so for convenience we define a Lag function which uses positive numbers to indicate lags. lag.zoo allows vector lags, so Lag(z2, 1:2) has two columns, one for each of the two lags.
library(dyn)
X = c(1, 4, 8, 9, 3, 5)
X2 = c(4, 6, 7, 9, 7, 8)
z <- zoo(X)
z2 <- zoo(X2)
Lag <- function(x, k = 1) lag(x, k = -k)
fit2 <- dyn$lm(z ~ Lag(z2, 1:2))
fit1 <- dyn$lm(z ~ Lag(z2))
fit0 <- dyn$lm(z ~ 1)
For example, here is fit2.
> fit2
Call:
lm(formula = dyn(z ~ Lag(z2, 1:2)))
Coefficients:
  (Intercept)  Lag(z2, 1:2)1  Lag(z2, 1:2)2
      19.3333        -1.4242        -0.4242
Here is a comparison of the three fits showing that the one and two lag fits are not significantly better than just using the intercept; however, there is quite a large drop in the residual sum of squares when the first lag is added to the intercept-only model, so you might want to ignore the statistical significance and use the first lag anyway.
> anova(fit0, fit1, fit2)
Analysis of Variance Table
Model 1: z ~ 1
Model 2: z ~ Lag(z2)
Model 3: z ~ Lag(z2, 1:2)
  Res.Df     RSS Df Sum of Sq      F Pr(>F)
1      3 22.7500
2      2  8.4211  1   14.3289 2.1891 0.3784
3      1  6.5455  1    1.8756 0.2865 0.6871
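As a rough check on how to read this table, the F statistics and p-values can be reproduced by hand from the RSS column. This sketch assumes, as the numbers suggest, that each sequential drop in RSS is scaled by the residual mean square of the largest model (Model 3); the variable names are made up for illustration.

```r
# Residual sums of squares and residual degrees of freedom copied from the table
rss    <- c(22.7500, 8.4211, 6.5455)
res_df <- c(3, 2, 1)

# Residual mean square of the largest model (Model 3), used as the denominator
scale <- rss[3] / res_df[3]

# Sequential drops in RSS and the degrees of freedom each drop uses up
drop_rss <- -diff(rss)
drop_df  <- -diff(res_df)

f_stat <- (drop_rss / drop_df) / scale
p_val  <- pf(f_stat, drop_df, res_df[3], lower.tail = FALSE)

round(cbind(F = f_stat, p = p_val), 4)
```

With only one residual degree of freedom left in the largest model, the F tests have very little power, which is why a large drop in RSS can still come out as not significant.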
It would also be possible to use the ts class in place of the zoo class; however, lag.ts does not support vector lags, so with ts each term would have to be written out separately. Lag is from above.
tt <- ts(X)
tt2 <- ts(X2)
fits12_ts <- dyn$lm(tt ~ Lag(tt2) + Lag(tt2, 2))
Upvotes: 1
Reputation: 3026
library(zoo)
set.seed(1111)
x <- as.zoo(rnorm(10, 0, 0.02))
y <- lag(x, -2, na.pad = TRUE)  # negative k lags the series; positive k would lead it
cbind(x, y)
Upvotes: 1