I've been using the EViews statconn DCOM interface to loop a large number of FRED series through the nsdiffs(test=c("ch")) function in R's forecast package, to see what percentage of them require seasonal differencing. However, in a trial run of 26,000+ series, the OCSB test returned 1,245 positive results (output=1) while the Canova-Hansen test returned zero. That makes me suspect I'm doing something wrong, but I can't figure out exactly what:
for !k = 1 to !keriesnumber
xput(rtype=ts) sertest!k 'pass the series from EViews to R
xrun library(forecast) 'call necessary forecast library
xrun sertest!k <- ts(sertest!k, start=c(1990, 1), end=c(2014, 6), frequency=12) 'specify timeseries properties
xrun kpss_!k<- ndiffs(sertest!k) 'kpss test
xrun ch_!k<-nsdiffs(sertest!k,m=frequency(sertest!k),test=c("ch")) 'ch test
xrun ocsb_!k<-nsdiffs(sertest!k,m=frequency(sertest!k),test=c("ocsb")) 'ocsb test
xget kpss_!k 'return the three output integers back to eviews
xget ch_!k
xget ocsb_!k
!count=!count+1
kpss_vector(!count)=kpss_!k 'store the results in a vector
ch_vector(!count)=ch_!k
ocsb_vector(!count)=ocsb_!k
next
Am I mis-specifying the frequency or something else in the nsdiffs command? If so, why does the OCSB test run (correctly, I think) without issue? About 5% of series (1,245 of 26,000+) seems a reasonable share for which the null is not rejected on a dataset of this type. As a non-native R user, I'm hoping I've merely forgotten to load a library or made some other simple mistake that some kind soul can identify :). Otherwise, it seems strange that none of the 26,000 macroeconomic series I tested would require seasonal differencing under the CH test. I also tried a similar routine with some of the data from the original CH paper (http://www.ssc.wisc.edu/~bhansen/progs/jbes_95.html) but still couldn't manage to find the hallowed output=1.
Upvotes: 1
Views: 4055
Reputation: 1049
Taking the series "wage" used in the applications shown in the reference paper, the value 1 is returned based on the Canova and Hansen test (i.e., non-stable seasonal cycles suggesting the need for seasonal differencing):
require(forecast)
wage <- structure(c(2.32, 2.35, 2.38, 2.4, 2.41, 2.44, 2.47, 2.5, 2.52,
2.55, 2.58, 2.61, 2.63, 2.66, 2.7, 2.73, 2.78, 2.82, 2.87, 2.92,
2.96, 3.02, 3.07, 3.12, 3.15, 3.2, 3.26, 3.3, 3.36, 3.43, 3.49,
3.53, 3.61, 3.66, 3.72, 3.79, 3.83, 3.9, 3.98, 4.05, 4.09, 4.18,
4.29, 4.39, 4.42, 4.48, 4.57, 4.66, 4.73, 4.8, 4.91, 5.01, 5.1,
5.19, 5.29, 5.4, 5.5, 5.63, 5.74, 5.89, 6, 6.07, 6.21, 6.34,
6.46, 6.57, 6.7, 6.9, 7.06, 7.17, 7.31, 7.44, 7.55, 7.62, 7.71,
7.81, 7.91, 7.96, 8.02, 8.15, 8.25, 8.29, 8.35, 8.43, 8.51, 8.54,
8.59, 8.68), .Tsp = c(1964, 1985.75, 4), class = "ts")
x <- diff(log(wage))
nsdiffs(x, m = frequency(x), test = "ch")  # name m explicitly; its argument position differs across forecast versions
# [1] 1
The value of the test statistic for this series is 1.11, close to the 1.14 reported in the reference paper. Differences may arise due to different choices of the lag truncation parameter when computing the covariance matrix. (The example below requires copying and pasting the functions SeasDummy and SD.test from the source files of the forecast package, as these functions are not exported.)
SD.test(x, 4)
# [1] 1.112811
When working with other series, remember to filter out possible unit roots at frequency zero first and, if required, apply the Canova and Hansen test to the transformed series, for example to the first differences of the data.
Edit
I did not find the statement mentioned by the OP in the documentation of auto.arima, but looking at the source code, it seems that the number of seasonal differences is determined first and a test to choose the number of regular differences is applied afterwards.
I am not familiar with the OCSB test, which is the default test in nsdiffs. For this test, it may be possible for a unit root at frequency zero to be present in the data when the test is applied (as is the case in the HEGY test, which has the same null hypothesis). However, if the Canova and Hansen test is used, it is advisable to transform the data first to remove possible unit roots at the zero frequency.
You can change the order in which d and D are selected by first running ndiffs and transforming the data accordingly. Then you can run nsdiffs on the transformed series with the option test="ch". In this way, you can pass the arguments d and D to auto.arima.
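A minimal sketch of that workflow, under some assumptions: the series (log of AirPassengers, which ships with base R) is only a stand-in and is not one of the OP's FRED series, and the variable names y, y_d and fit are made up for illustration. Depending on the installed version of forecast, test = "ch" may also need the uroot package to be installed.

```r
library(forecast)

# Stand-in monthly series from the datasets package (assumption: not the OP's data)
y <- log(AirPassengers)

# Step 1: choose the number of regular differences first (KPSS test by default)
d <- ndiffs(y)

# Step 2: remove possible unit roots at frequency zero before the seasonal test
y_d <- if (d > 0) diff(y, differences = d) else y

# Step 3: Canova and Hansen test on the transformed series
D <- nsdiffs(y_d, m = frequency(y_d), test = "ch")

# Step 4: hand both orders to auto.arima so it does not re-select them
fit <- auto.arima(y, d = d, D = D)
```

The same steps could be sent through xrun inside the EViews loop, returning d and D with xget as before.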
Upvotes: 1