Reputation: 1461
I am trying to apply the script from the R-bloggers page below to my own data set.
https://www.r-bloggers.com/forecasting-weekly-data/
I converted my data to a time series and modified the script, but I get the error "Error in ...fourier(x, K, length(x) + (1:h)) : K must be not be greater than period/2". I am sharing the script I created below. Can anyone explain what this error means?
library(forecast)  # provides auto.arima(), fourier(), fourierf(), forecast()
DataFVM <- read.csv("AM1.csv", header=TRUE, na.strings=c("NULL",""))
Data <- subset(DataFVM, select=c(ID,Backlog))
Data <- Data[(Data$ID %in% c('905')),]
backlog <- as.vector(Data$Backlog)
backlog <- as.ts(backlog)
bestfit <- list(aicc=Inf)
for(i in 1:25)
{
  fit <- auto.arima(backlog, xreg=fourier(backlog, K=i), seasonal=FALSE)
  if(fit$aicc < bestfit$aicc)
    bestfit <- fit
  else break;
}
fc <- forecast(bestfit, xreg=fourierf(backlog, K=1, h=104))
Below is the data set I am using:
ID Backlog
905 0.99
905 0.96
905 0.98
905 0.87
905 0.95
905 0.91
905 0.96
905 0.92
905 0.9
905 0.91
905 0.96
905 0.95
905 0.87
905 0.99
905 0.95
905 0.99
905 0.93
905 0.94
905 0.96
905 0.98
905 0.71
905 0.84
905 0.86
905 0.92
905 0.91
905 1
905 0.96
905 0.92
905 0.96
905 0.92
905 0.83
905 0.93
905 0.97
905 0.67
905 0.89
905 0.92
905 0.95
905 0.94
905 0.95
905 1
905 0.98
905 0.94
905 0.88
Upvotes: 0
Views: 1720
Reputation: 164
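The reason appears to be that the fourier function takes the seasonal period from the frequency of your time series, and a series created from a plain vector without an explicit frequency has frequency 1, so even K=1 is greater than period/2. If you believe your data has a seasonal period of 52, create the series with that frequency by changing the following line like so: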
#backlog <- as.ts(backlog)
# as.ts() on a plain vector does not use a frequency argument, so use ts() instead
backlog <- ts(backlog, frequency=52)
Now fourier understands 52 to be the period, so K may be at most period/2 = 26. Iterating 'i' from 1 to 25 therefore stays within bounds, while any K above 26 would produce the same error you're receiving now: K must be not be greater than period/2.
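To check the frequency point without the forecast package, here is a minimal base-R sketch. It uses only the first few Backlog values from the question, and `max_K` is a hypothetical helper written for illustration (it simply applies the K <= period/2 rule from the error message); it is not part of any package.

```r
# Base R only; no forecast package needed for this check.
x <- c(0.99, 0.96, 0.98, 0.87, 0.95, 0.91)  # first few Backlog values

# as.ts() on a plain vector builds a series with frequency 1;
# the extra frequency argument is not used.
freq_as_ts <- frequency(as.ts(x, frequency = 52))

# ts() sets the frequency explicitly.
freq_ts <- frequency(ts(x, frequency = 52))

# Hypothetical helper: largest K allowed by the rule K <= period/2.
max_K <- function(series) floor(frequency(series) / 2)

print(c(as.ts = freq_as_ts, ts = freq_ts))
print(c(as.ts = max_K(as.ts(x)), ts = max_K(ts(x, frequency = 52))))
```

With frequency 1 the largest permitted K is 0, which is why the loop fails on its first iteration at K=1.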
I took a quick crack at finding a model to represent the data. Below is a plot of the partial autocorrelations, which shows significant lags at 5, 13, and 20.
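For anyone who wants to recompute the partial autocorrelations, the following base-R sketch uses the 43 Backlog values from the question. The lag limit of 20 and the approximate 95% bound of 1.96/sqrt(n) are assumptions made here, not necessarily the settings behind the plot.

```r
# Backlog values copied from the question (ID 905)
backlog <- c(0.99, 0.96, 0.98, 0.87, 0.95, 0.91, 0.96, 0.92, 0.90, 0.91,
             0.96, 0.95, 0.87, 0.99, 0.95, 0.99, 0.93, 0.94, 0.96, 0.98,
             0.71, 0.84, 0.86, 0.92, 0.91, 1.00, 0.96, 0.92, 0.96, 0.92,
             0.83, 0.93, 0.97, 0.67, 0.89, 0.92, 0.95, 0.94, 0.95, 1.00,
             0.98, 0.94, 0.88)

# Partial autocorrelations for lags 1..20, without drawing the plot
p <- pacf(backlog, lag.max = 20, plot = FALSE)
pacf_values <- as.vector(p$acf)

# Approximate 95% significance bound (assumption: large-sample normal bound)
bound <- qnorm(0.975) / sqrt(length(backlog))

# Lags whose partial autocorrelation falls outside the bound
sig_lags <- which(abs(pacf_values) > bound)
print(sig_lags)
```

Calling `pacf(backlog)` without `plot = FALSE` draws the usual plot with the same bounds.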
After looking at a few SARIMAX models, I settled on the following specification, which had the lowest MSE on the validation data set:
ARIMA(0,0,0)(0,0,1)[13]
That is a seasonal MA term at lag 13, with X = a dummy vector identifying the 2 significant drops, and I included a constant.
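A model of this form can be sketched in base R with `arima()`. The data are the Backlog values from the question. The dummy vector `dip` is an assumption: it flags the two lowest values (0.71 and 0.67) using a hypothetical threshold of 0.75, which may not be exactly how the drops were identified for the model above.

```r
# Backlog values copied from the question (ID 905)
backlog <- c(0.99, 0.96, 0.98, 0.87, 0.95, 0.91, 0.96, 0.92, 0.90, 0.91,
             0.96, 0.95, 0.87, 0.99, 0.95, 0.99, 0.93, 0.94, 0.96, 0.98,
             0.71, 0.84, 0.86, 0.92, 0.91, 1.00, 0.96, 0.92, 0.96, 0.92,
             0.83, 0.93, 0.97, 0.67, 0.89, 0.92, 0.95, 0.94, 0.95, 1.00,
             0.98, 0.94, 0.88)

# Assumption: the "2 significant drops" are the values below 0.75
dip <- as.numeric(backlog < 0.75)

# No non-seasonal terms, one seasonal MA term with period 13,
# the dummy as external regressor, and a constant (include.mean)
fit <- arima(backlog,
             order = c(0, 0, 0),
             seasonal = list(order = c(0, 0, 1), period = 13),
             xreg = dip,
             include.mean = TRUE)
print(fit)
```

The period is passed explicitly in `seasonal`, so this sketch does not depend on the frequency attribute of the series.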
Below is the performance of a rolling 1 period ahead prediction:
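A rolling one-step-ahead evaluation of this kind can be sketched in base R as follows. Several choices here are assumptions rather than the setup actually used: the validation period starts at observation 36 (`first_test` is a hypothetical split), the model is refitted on an expanding window at each step, and the dummy `dip` flags the values below 0.75.

```r
# Backlog values copied from the question (ID 905)
backlog <- c(0.99, 0.96, 0.98, 0.87, 0.95, 0.91, 0.96, 0.92, 0.90, 0.91,
             0.96, 0.95, 0.87, 0.99, 0.95, 0.99, 0.93, 0.94, 0.96, 0.98,
             0.71, 0.84, 0.86, 0.92, 0.91, 1.00, 0.96, 0.92, 0.96, 0.92,
             0.83, 0.93, 0.97, 0.67, 0.89, 0.92, 0.95, 0.94, 0.95, 1.00,
             0.98, 0.94, 0.88)
n <- length(backlog)

dip <- as.numeric(backlog < 0.75)  # assumption: marks the two large drops
first_test <- 36                   # assumption: hypothetical validation start

pred <- rep(NA_real_, n)
for (t in first_test:n) {
  train <- 1:(t - 1)               # expanding training window
  fit <- arima(backlog[train],
               order = c(0, 0, 0),
               seasonal = list(order = c(0, 0, 1), period = 13),
               xreg = dip[train],
               include.mean = TRUE)
  # One-step-ahead forecast using the dummy value for period t
  pred[t] <- predict(fit, n.ahead = 1, newxreg = dip[t])$pred[1]
}

# Mean squared error over the validation period
test_idx <- first_test:n
mse <- mean((backlog[test_idx] - pred[test_idx])^2)
print(mse)
```

Refitting at every step keeps each forecast strictly out of sample, at the cost of one model fit per validation point.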
Hope this helps, good luck!
Upvotes: 3