Charlie


Why does NSDIFFS (R forecast package) never show seasonality?

I've been using the EViews statconn DCOM interface to loop a large number of FRED series through the nsdiffs(test=c("ch")) function in R's forecast package, to see what percentage of them require seasonal differencing. However, after a trial run of 26,000+ series, the OCSB test returns 1,245 positive results (output=1), while the Canova-Hansen test returns zero. That makes me suspect I'm doing something wrong, but I can't figure out exactly what:

for !k = 1 to !keriesnumber
    xput(rtype=ts) sertest!k 'pass the series from EViews to R
    xrun library(forecast) 'call necessary forecast library
    xrun sertest!k <- ts(sertest!k, start=c(1990, 1), end=c(2014, 6), frequency=12)  'specify timeseries properties
    xrun kpss_!k<-  ndiffs(sertest!k)   'kpss test
    xrun ch_!k<-nsdiffs(sertest!k,m=frequency(sertest!k),test=c("ch")) 'ch test
    xrun ocsb_!k<-nsdiffs(sertest!k,m=frequency(sertest!k),test=c("ocsb")) 'ocsb test
    xget kpss_!k 'return the three output integers back to eviews
    xget ch_!k
    xget ocsb_!k
    !count=!count+1
    kpss_vector(!count)=kpss_!k 'store the results in a vector
    ch_vector(!count)=ch_!k
    ocsb_vector(!count)=ocsb_!k
next

Am I mis-specifying the frequency or something else in the nsdiffs command? If so, why does the OCSB test (I think correctly?) run without issue? About 5% of series (1,245 of 26,000+) seems a reasonable share in which to not reject the null on a dataset of this type. As a non-native R user, I'm hoping I've merely forgotten to call a library or something equally simple, and that some kind soul can identify the issue :). Otherwise, it seems strange that none of the 26,000 macroeconomic series I tested would require seasonal differencing under the CH test. I also tried a similar routine with some of the data from the original CH paper (http://www.ssc.wisc.edu/~bhansen/progs/jbes_95.html) but still couldn't manage to get the hallowed output=1.

Upvotes: 1

Views: 4055

Answers (1)

javlacalle

Reputation: 1049

Taking the series "wage" used in the applications shown in the reference paper, the value 1 is returned based on the Canova and Hansen test (i.e., non-stable seasonal cycles suggesting the need for seasonal differencing):

require(forecast)
wage <- structure(c(2.32, 2.35, 2.38, 2.4, 2.41, 2.44, 2.47, 2.5, 2.52, 
2.55, 2.58, 2.61, 2.63, 2.66, 2.7, 2.73, 2.78, 2.82, 2.87, 2.92, 
2.96, 3.02, 3.07, 3.12, 3.15, 3.2, 3.26, 3.3, 3.36, 3.43, 3.49, 
3.53, 3.61, 3.66, 3.72, 3.79, 3.83, 3.9, 3.98, 4.05, 4.09, 4.18, 
4.29, 4.39, 4.42, 4.48, 4.57, 4.66, 4.73, 4.8, 4.91, 5.01, 5.1, 
5.19, 5.29, 5.4, 5.5, 5.63, 5.74, 5.89, 6, 6.07, 6.21, 6.34, 
6.46, 6.57, 6.7, 6.9, 7.06, 7.17, 7.31, 7.44, 7.55, 7.62, 7.71, 
7.81, 7.91, 7.96, 8.02, 8.15, 8.25, 8.29, 8.35, 8.43, 8.51, 8.54, 
8.59, 8.68), .Tsp = c(1964, 1985.75, 4), class = "ts")
x <- diff(log(wage))
nsdiffs(x, frequency(x), test = "ch")
# [1] 1

The value of the test statistic for this series is 1.11, close to the 1.14 reported in the reference paper. Differences may arise due to different choices of the lag truncation parameter when computing the covariance matrix. (The example below requires copying and pasting the functions SeasDummy and SD.test from the source files of the forecast package, as these functions are not exported.)

SD.test(x, 4)
# [1] 1.112811
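
As an alternative to copying the source, an unexported function can usually be fetched straight from the package namespace. The helper below is only a sketch: the name ch_statistic is made up for illustration, and it assumes that the installed version of forecast still contains an internal SD.test, which may not hold for every release. If the function is absent, it returns NA with a warning rather than failing.

```r
# Hypothetical helper (not part of forecast): call the internal SD.test
# if, and only if, the installed forecast version still defines it.
ch_statistic <- function(x, m = frequency(x)) {
  ns <- asNamespace("forecast")  # loads the namespace without attaching it
  # inherits = FALSE restricts the lookup to forecast's own namespace
  if (!exists("SD.test", envir = ns, inherits = FALSE)) {
    warning("SD.test was not found in the installed version of forecast")
    return(NA_real_)
  }
  sd_test <- get("SD.test", envir = ns, inherits = FALSE)
  sd_test(x, m)
}
```

With the series x defined above, ch_statistic(x, 4) would stand in for the SD.test(x, 4) call.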

When working with other series, remember to filter out possible unit roots at frequency zero and to apply the Canova and Hansen test to the transformed series (if required), for example to the first differences of the data.

Edit

I did not find the statement mentioned by the OP in the documentation of auto.arima, but looking at the source code, it seems that the number of seasonal differences is determined first, and a test to choose the number of regular differences is applied afterwards.

I am not familiar with the OCSB test, which is the default test in nsdiffs. For this test, it may be acceptable for a unit root at frequency zero to be present in the data when the test is applied (as is the case in the HEGY test, which has the same null hypothesis). However, if the Canova and Hansen test is used, it is advisable to transform the data first to remove possible unit roots at the zero frequency.

You can change the order in which d and D are selected by first running ndiffs and transforming the data accordingly. Then you can run nsdiffs on the transformed series with the option test="ch". You can then pass the resulting values as the arguments d and D to auto.arima.
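
A minimal sketch of that workflow follows. It uses the built-in AirPassengers series purely as a stand-in for your own data, which is my assumption and not part of the question. Depending on the installed version of forecast, test = "ch" may additionally require the uroot package.

```r
library(forecast)

# Stand-in monthly series; replace with your own ts object
y <- log(AirPassengers)

# 1. Choose the number of regular differences first (KPSS test)
d <- ndiffs(y, test = "kpss")

# 2. Remove the zero-frequency unit root(s) before testing seasonality
y_d <- if (d > 0) diff(y, differences = d) else y

# 3. Canova-Hansen test on the transformed series
D <- nsdiffs(y_d, test = "ch")

# 4. Fit on the original series with both orders fixed
fit <- auto.arima(y, d = d, D = D)
```

Fixing d and D in the auto.arima call keeps it from re-running its own differencing tests in its default order.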

Upvotes: 1
