Reputation: 1686
I have a time series, say:
694 281 479 646 282 317 790 591 573 605 423 639 873 420 626 849 596 486 578 457 465 518 272 549 437 445 596 396 259 390
Now I want to forecast the next values with an ARIMA model, but ARIMA requires the time series to be stationary. So first I have to check whether the series above meets that requirement, which is where fUnitRoots comes in.
I think http://cran.r-project.org/web/packages/fUnitRoots/fUnitRoots.pdf can offer some help, but there is no simple tutorial.
I just want a small demo showing how to test a time series for stationarity. Is there one?
Thanks in advance.
Upvotes: 4
Views: 6484
Reputation: 15458
I will give an example using the urca package in R.
library(urca)
data(npext) # This is the data used by Nelson and Plosser (1982)
sample.data<-npext
head(sample.data)
year cpi employmt gnpdefl nomgnp interest indprod gnpperca realgnp wages realwag sp500 unemploy velocity M
1 1860 3.295837 NA NA NA NA -0.1053605 NA NA NA NA NA NA NA NA
2 1861 3.295837 NA NA NA NA -0.1053605 NA NA NA NA NA NA NA NA
3 1862 3.401197 NA NA NA NA -0.1053605 NA NA NA NA NA NA NA NA
4 1863 3.610918 NA NA NA NA 0.0000000 NA NA NA NA NA NA NA NA
5 1864 3.871201 NA NA NA NA 0.0000000 NA NA NA NA NA NA NA NA
6 1865 3.850148 NA NA NA NA 0.0000000 NA NA NA NA NA NA NA NA
I will use the ADF test to perform the unit root test on the industrial production index (indprod) as an illustration. The lag is selected based on the SIC (also known as BIC). I use the trend specification because there is a trend in the data.
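The call itself is not shown in this answer. A minimal sketch of a call matching the description (trend specification, BIC lag selection) follows. The maximum lag of 1 and the object name adf_trend are my assumptions, not taken from the original post, so the printed summary may not match the output below exactly.

```r
library(urca)

data(npext)  # extended Nelson and Plosser data shipped with urca

# Industrial production index; na.omit() is a precaution because
# ur.df() stops if the series contains missing values.
indprod <- na.omit(npext$indprod)

# ADF test with intercept and linear trend. `lags` is the maximum lag
# considered (assumed to be 1 here); selectlags = "BIC" chooses the lag
# order by the Bayesian (Schwarz) information criterion.
adf_trend <- ur.df(indprod, type = "trend", lags = 1, selectlags = "BIC")

# Prints the test regression, the test statistics (tau3, phi2, phi3)
# and their critical values.
summary(adf_trend)
```

Raising `lags` lets BIC choose among more candidate lag orders, at the cost of a shorter estimation sample.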
###############################################
# Augmented Dickey-Fuller Test Unit Root Test #
###############################################
Test regression trend
Call:
lm(formula = z.diff ~ z.lag.1 + 1 + tt + z.diff.lag)
Residuals:
Min 1Q Median 3Q Max
-0.31644 -0.04813 0.00965 0.05252 0.20504
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.052208 0.017273 3.022 0.003051 **
z.lag.1 -0.176575 0.049406 -3.574 0.000503 ***
tt 0.007185 0.002061 3.486 0.000680 ***
z.diff.lag 0.124320 0.089153 1.394 0.165695
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.09252 on 123 degrees of freedom
Multiple R-squared: 0.09796, Adjusted R-squared: 0.07596
F-statistic: 4.452 on 3 and 123 DF, p-value: 0.005255
Value of test-statistic is: -3.574 11.1715 6.5748
Critical values for test statistics:
1pct 5pct 10pct
tau3 -3.99 -3.43 -3.13
phi2 6.22 4.75 4.07
phi3 8.43 6.49 5.47
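Interpretation: BIC selects lag 1 as the optimal lag. The test statistic (-3.574) is less than the tau3 critical value at the 5 percent level (-3.43), but not less than the critical value at the 1 percent level (-3.99). So the null hypothesis of a unit root is rejected at the 5 percent level, but not at the 1 percent level.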
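The same decision rule can be applied to the series from the question. This is only a sketch: the choice of type = "drift" (intercept, no trend) and a maximum of 3 lags are my assumptions, and with only 30 observations the test has little power, so treat the result with caution.

```r
library(urca)

# The 30 observations given in the question.
x <- c(694, 281, 479, 646, 282, 317, 790, 591, 573, 605,
       423, 639, 873, 420, 626, 849, 596, 486, 578, 457,
       465, 518, 272, 549, 437, 445, 596, 396, 259, 390)

# ADF test with an intercept but no trend (assumed specification);
# up to 3 lags are considered (assumed) and BIC picks the lag order.
adf_x <- ur.df(x, type = "drift", lags = 3, selectlags = "BIC")

# The first test statistic is the tau statistic; the first row of the
# critical value table holds its 1, 5 and 10 percent critical values.
tau_stat <- adf_x@teststat[1, 1]
tau_crit <- adf_x@cval[1, ]

print(tau_stat)
print(tau_crit)

# TRUE means the unit root null is rejected at that level, because the
# statistic is more negative than the critical value.
reject_unit_root <- tau_stat < tau_crit
print(reject_unit_root)
```

If the null is rejected, the series can be treated as stationary and modelled without differencing; otherwise, difference the series and repeat the test.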
Also, check the free forecasting book available here
Upvotes: 4
Reputation: 678
You can, of course, carry out formal tests such as the ADF test, but I would suggest carrying out "informal tests" of stationarity as a first step.
Inspecting the data visually using plot() will help you identify whether or not the data is stationary.
The next step would be to investigate the autocorrelation function and partial autocorrelation function of the data. You can do this by calling both the acf() and pacf() functions. This will not only help you decide whether or not the data is stationary, but it will also help you identify tentative ARIMA models that can later be estimated and used for forecasting if they get the all clear after carrying out the necessary diagnostic checks.
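As a minimal sketch of these informal checks on the series from the question (base R only; the axis label, titles and panel layout are my own choices):

```r
# The 30 observations given in the question, stored as a time series.
x <- ts(c(694, 281, 479, 646, 282, 317, 790, 591, 573, 605,
          423, 639, 873, 420, 626, 849, 596, 486, 578, 457,
          465, 518, 272, 549, 437, 445, 596, 396, 259, 390))

# Stack the three diagnostic plots in one window.
old_par <- par(mfrow = c(3, 1))

# 1. Time plot: look for a trend or a changing level or variance.
plot(x, ylab = "value", main = "Time plot")

# 2. Autocorrelation function: a slow, near-linear decay suggests
#    non-stationarity; a quick drop towards zero suggests stationarity.
acf(x, main = "ACF")

# 3. Partial autocorrelation function: helps suggest a tentative AR order.
pacf(x, main = "PACF")

# Restore the previous graphics settings.
par(old_par)
```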
You should, however, bear in mind that there are only 30 observations in the data that you provided. This falls below the commonly cited practical minimum of about 50 observations for forecasting with ARIMA models.
If it helps, as soon as I plotted the data I was fairly confident that it is stationary. The estimated ACF and PACF seem to confirm this view. Sometimes informal checks like these suffice.
A Little Book of R for Time Series may help you further.
Upvotes: 1