Reputation: 5660
I'm fitting an Arima(2,0,0) model using the forecast package in R on the usconsumption dataset. However, when I mimic the same fit using lm, I get different coefficients. My understanding is that they should be the same. Below is my code.
> library(forecast)
> library(fpp)
>
> #load data
> data("usconsumption")
>
> #create equivalent data frame from time-series
> lagpad <- function(x, k=1) {
+ c(rep(NA, k), x)[1 : length(x)]
+ }
>
> usconsumpdf <- as.data.frame(usconsumption)
> usconsumpdf$consumptionLag1 <- lagpad(usconsumpdf$consumption)
> usconsumpdf$consumptionLag2 <- lagpad(usconsumpdf$consumption, 2)
>
> #create arima and lm models
> arima1 <- Arima(usconsumption[,1], xreg=usconsumption[,2], order=c(2,0,0))
> lm1 <- lm(consumption~consumptionLag1+consumptionLag2+income, data=usconsumpdf)
>
> #show coefficients
> arima1
Series: usconsumption[, 1]
ARIMA(2,0,0) with non-zero mean
Coefficients:
         ar1     ar2  intercept  usconsumption[, 2]
      0.1325  0.2924     0.5641              0.2578
s.e.  0.0826  0.0747     0.0883              0.0530
sigma^2 estimated as 0.3538: log likelihood=-145.59
AIC=301.19 AICc=301.57 BIC=316.69
> summary(lm1)
Call:
lm(formula = consumption ~ consumptionLag1 + consumptionLag2 +
income, data = usconsumpdf)
Residuals:
Min 1Q Median 3Q Max
-2.22400 -0.31689 -0.01079 0.34280 1.43839
Coefficients:
                Estimate Std. Error t value Pr(>|t|)
(Intercept)      0.27373    0.08031   3.408 0.000829 ***
consumptionLag1  0.16423    0.07547   2.176 0.031039 *
consumptionLag2  0.21857    0.07198   3.037 0.002800 **
income           0.26670    0.05247   5.082 1.04e-06 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.5952 on 158 degrees of freedom
(2 observations deleted due to missingness)
Multiple R-squared: 0.2853, Adjusted R-squared: 0.2717
F-statistic: 21.02 on 3 and 158 DF, p-value: 1.637e-11
Upvotes: 3
Views: 573
Reputation: 4995
The documentation of arima() (Arima() is just a wrapper for arima()) says this about the fitting method:
Fitting methods
The exact likelihood is computed via a state-space representation of the ARIMA process, and the innovations and their variance found by a Kalman filter. The initialization of the differenced ARMA process uses stationarity and is based on Gardner et al (1980). ...
Whereas lm uses least squares (via QR factorization), as mentioned here:
https://stats.stackexchange.com/questions/175983/whats-the-underlying-algorithm-used-by-rs-lm
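As a rough illustration of what "least squares via QR" means, the sketch below solves a small regression with base R's qr() and qr.coef() and compares the result with lm(). The toy data and the variable names (x, y, X, beta_qr, beta_lm) are made up for this example and are not taken from the question.

```r
set.seed(42)

# Hypothetical toy data, not from the question
x <- rnorm(50)
y <- 1 + 2 * x + rnorm(50, sd = 0.3)

# Design matrix with an explicit intercept column
X <- cbind(intercept = 1, x = x)

# Least-squares coefficients obtained from a QR decomposition of X
beta_qr <- qr.coef(qr(X), y)

# The same regression through lm()
beta_lm <- coef(lm(y ~ x))

# Compare the two coefficient vectors, ignoring their names
print(all.equal(unname(beta_qr), unname(beta_lm)))
```

Both routes minimize the same residual sum of squares, so the coefficients should agree up to numerical tolerance.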
In the documentation of lm(), the weights argument is described as follows:
... an optional vector of weights to be used in the fitting process. If specified, weighted least squares is used with weights weights (that is, minimizing sum(w*e^2)); otherwise ordinary least squares is used.
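To get a feel for how much the fitting method alone can matter, here is a minimal sketch on a simulated AR(2) series (hypothetical data, not usconsumption, and without xreg). It fits the same model with lm() on explicit lags and with arima() using method = "ML" and method = "CSS". One assumption to flag: arima() reports the series mean under the label "intercept", so the lm() intercept is converted to a mean before comparing.

```r
set.seed(1)

# Simulated AR(2) series with a non-zero mean (hypothetical data, not usconsumption)
y <- arima.sim(model = list(ar = c(0.2, 0.3)), n = 200) + 0.5
n <- length(y)

# OLS on explicit lags, set up the same way as the lm() fit in the question
lagged <- data.frame(y = y[3:n], lag1 = y[2:(n - 1)], lag2 = y[1:(n - 2)])
ols <- lm(y ~ lag1 + lag2, data = lagged)

# Same AR(2) model via arima(): exact maximum likelihood vs conditional sum of squares
fit_ml  <- arima(y, order = c(2, 0, 0), method = "ML")
fit_css <- arima(y, order = c(2, 0, 0), method = "CSS")

# arima() labels the series mean "intercept"; convert the lm() intercept
# c = mu * (1 - phi1 - phi2) back to a mean so the rows are comparable
ols_ar   <- unname(coef(ols)[c("lag1", "lag2")])
ols_mean <- unname(coef(ols)["(Intercept)"]) / (1 - sum(ols_ar))

comparison <- rbind(
  ols = c(ols_ar, ols_mean),
  css = unname(coef(fit_css)),
  ml  = unname(coef(fit_ml))
)
colnames(comparison) <- c("ar1", "ar2", "mean")
print(round(comparison, 4))
```

In a sketch like this, the CSS row would be expected to track the OLS row closely, since both minimize a sum of squared one-step errors conditional on the first two observations, while the ML row can differ somewhat. Also, according to ?arima, when xreg is supplied a regression with ARMA errors is fitted, so the model in the question is not parametrized the same way as an lm() formula with lagged consumption; the estimation method may not be the only source of the gap.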
Upvotes: 3