Reputation: 111
I was trying something like the auto.arima example in https://otexts.com/fpp2/lagged-predictors.html and noticed that I get different results depending on whether or not I explicitly specify (all) rows of the data. MWE:
library(forecast); library(fpp2)
nrow(insurance)
auto.arima(insurance[,1], xreg=insurance[,2], stationary=TRUE)
auto.arima(insurance[1:40,1], xreg=insurance[1:40,2], stationary=TRUE)
nrow(insurance) shows there are 40 rows, so I'd expect insurance[,1] to be the same as insurance[1:40,1], and similarly for the second column. Yet the first call results in a "Regression with ARIMA(3,0,0) errors", whereas the second results in a "Regression with ARIMA(1,0,2) errors".
Why do these seemingly equivalent calls result in different selected models?
Upvotes: 0
Views: 590
Reputation: 111
Corey nudged me in the right direction: insurance[,1] is a "time series" whereas insurance[1:40,1] is numeric. That is, is.ts(insurance[,1]) is TRUE but is.ts(insurance[1:40,1]) is FALSE. The forecast package has a subset function that preserves the time series structure, so is.ts(subset(insurance[,1],start=1,end=40)) is TRUE and
auto.arima(subset(insurance[,1],start=1,end=40),
xreg=subset(insurance[,2],start=1,end=40), stationary=TRUE)
gives the same output as the first version in my question (with insurance[,1] and insurance[,2]).
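The class difference can be reproduced without the forecast package. What follows is only a minimal sketch: the two-column monthly series m is made up to stand in for insurance (its column names, values, and start date are assumptions), and base R's window() is used in place of forecast's subset().

```r
# Made-up two-column monthly series standing in for `insurance`
# (column names, values and start date are assumptions for illustration).
m <- ts(cbind(y = seq_len(40), x = rev(seq_len(40))),
        start = c(2002, 1), frequency = 12)

# Selecting a whole column keeps the time series attributes ...
col_full <- m[, 1]
is.ts(col_full)   # TRUE

# ... but adding a row index returns a plain vector.
col_rows <- m[1:40, 1]
is.ts(col_rows)   # FALSE

# window() from base stats selects by time and keeps the ts class.
col_win <- window(m[, 1], start = c(2002, 1), end = c(2005, 4))
is.ts(col_win)    # TRUE
length(col_win)   # 40
```

So the two calls in the question pass objects of different classes, even though the underlying numbers are the same.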
I think that explains "why" at least superficially, although I don't understand
1) why the time series structure changes the result here (since there doesn't seem to be any seasonality in the selected models?), and
2) why in the linked example Hyndman uses insurance[4:40,1] instead of the subset() function from his own forecast package?
I'll wait to see if somebody wants to answer those "deeper" questions; otherwise I'll probably accept this answer.
Upvotes: 0
Reputation: 1651
Note that insurance[,1] has labels and insurance[1:40,1] does not. If you pass as.numeric(insurance[,1]), you actually get "ARIMA(1,0,2)", so I bet it depends on whether the first argument has labels. Also note that it doesn't matter whether you use xreg=insurance[,2] or xreg=insurance[1:40,2]; both work.
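A rough base-R sketch of what as.numeric() does to such a series, using a made-up monthly series (the values and start date are assumptions, not the insurance data): the "labels" are the time series attributes, and once they are dropped the reported frequency falls from 12 to 1.

```r
# Made-up monthly series; values and start date are assumptions.
x <- ts(seq(10, by = 0.5, length.out = 40),
        start = c(2002, 1), frequency = 12)

v <- as.numeric(x)   # strips the ts attributes (the "labels")

frequency(x)         # 12
frequency(v)         # 1
is.null(tsp(v))      # TRUE: no time index left

# The stripped vector holds the same values as the row-indexed version.
all.equal(v, x[1:40])
```

This is consistent with as.numeric(insurance[,1]) and insurance[1:40,1] leading to the same selected model.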
Upvotes: 1