user3734568

Reputation: 1461

Forecast Time Series for weekly data

I am trying to use the script from the R-bloggers page below on my own data set.

https://www.r-bloggers.com/forecasting-weekly-data/

I converted my data to a time series and adapted the script, but I get the error "Error in ...fourier(x, K, length(x) + (1:h)) : K must be not be greater than period/2". The script I created is below. Can anyone explain what this error means?

DataFVM <- read.csv("AM1.csv", header=TRUE,na.strings=c("NULL",""))

Data <- subset(DataFVM,select=c(ID,Backlog))
Data <- Data[(Data$ID %in% c('905')),]
backlog <- as.vector(Data$Backlog)
backlog <- as.ts(backlog)

bestfit <- list(aicc=Inf)
for(i in 1:25)
{
  fit <- auto.arima(backlog, xreg=fourier(backlog, K=i), seasonal=FALSE)
  if(fit$aicc < bestfit$aicc)
    bestfit <- fit
  else break;
}
fc <- forecast(bestfit, xreg=fourierf(backlog, K=1, h=104))

Below is the data set I am using:

ID  Backlog
905 0.99
905 0.96
905 0.98
905 0.87
905 0.95
905 0.91
905 0.96
905 0.92
905 0.9
905 0.91
905 0.96
905 0.95
905 0.87
905 0.99
905 0.95
905 0.99
905 0.93
905 0.94
905 0.96
905 0.98
905 0.71
905 0.84
905 0.86
905 0.92
905 0.91
905 1
905 0.96
905 0.92
905 0.96
905 0.92
905 0.83
905 0.93
905 0.97
905 0.67
905 0.89
905 0.92
905 0.95
905 0.94
905 0.95
905 1
905 0.98
905 0.94
905 0.88

Upvotes: 0

Views: 1720

Answers (1)

Data Junki

Reputation: 164

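The reason appears to be that the fourier function expects your time series to have a frequency associated with it. as.ts(backlog) gives the series a frequency of 1, so period/2 is 0.5 and every K >= 1 is rejected. If you believe your data has a seasonal period of 52, change the following line like so:

#backlog <- as.ts(backlog)
# use ts() here: as.ts() on a plain vector ignores the frequency argument
backlog <- ts(backlog, frequency=52)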

Now fourier treats 52 as the period, so K can be at most period/2 = 26. Iterating 'i' from 1 to 25 stays within that limit; any K above 26 would produce the same error you're receiving now: K must be not be greater than period/2.
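To make the period/2 limit concrete, here is a minimal base-R sketch. It does not use the forecast package: max_k and fourier_terms are hypothetical helpers written for illustration, and fourier_terms only approximates what fourier builds (pairs of sin and cos columns at harmonics of the seasonal period).

```r
# First ten Backlog values from the question (enough to illustrate the point)
x <- c(0.99, 0.96, 0.98, 0.87, 0.95, 0.91, 0.96, 0.92, 0.90, 0.91)

no_freq   <- ts(x)                  # frequency defaults to 1
with_freq <- ts(x, frequency = 52)  # weekly data with a yearly period

# Hypothetical helper: largest K allowed, since K must not exceed period/2
max_k <- function(y) floor(frequency(y) / 2)
max_k(no_freq)    # 0, so every K >= 1 is rejected
max_k(with_freq)  # 26

# Hypothetical helper: hand-built Fourier terms sin/cos(2*pi*k*t/m)
fourier_terms <- function(y, K) {
  m <- frequency(y)
  stopifnot(K >= 1, 2 * K <= m)  # same restriction as the error message
  t <- seq_along(y)
  out <- do.call(cbind, lapply(seq_len(K), function(k)
    cbind(sin(2 * pi * k * t / m), cos(2 * pi * k * t / m))))
  colnames(out) <- paste0(rep(c("S", "C"), K), rep(seq_len(K), each = 2))
  out
}

dim(fourier_terms(with_freq, K = 3))  # 10 rows, 6 columns
```

Calling fourier_terms(no_freq, K = 1) stops with an error, which mirrors the situation in the question: with a frequency of 1 there is no valid K.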

I took a quick crack at finding a model to represent the data. Below is a plot of the partial autocorrelations showing significant lags at 5, 13, and 20.

Partial Autocorrelation
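A plot like this can be reproduced roughly as follows. This is only a sketch using base R's pacf on the 43 values posted in the question; the 95% bound is the usual large-sample approximation, and the lags it flags may not match the plot exactly, since the settings used for the original figure are not stated.

```r
# Backlog values for ID 905, copied from the question
backlog <- ts(c(
  0.99, 0.96, 0.98, 0.87, 0.95, 0.91, 0.96, 0.92, 0.90, 0.91,
  0.96, 0.95, 0.87, 0.99, 0.95, 0.99, 0.93, 0.94, 0.96, 0.98,
  0.71, 0.84, 0.86, 0.92, 0.91, 1.00, 0.96, 0.92, 0.96, 0.92,
  0.83, 0.93, 0.97, 0.67, 0.89, 0.92, 0.95, 0.94, 0.95, 1.00,
  0.98, 0.94, 0.88
))

# Partial autocorrelations up to lag 20 (set plot = TRUE to draw the chart)
p <- pacf(backlog, lag.max = 20, plot = FALSE)

# Approximate 95% significance bound for a white-noise series
bound <- qnorm(0.975) / sqrt(length(backlog))

# Lags whose partial autocorrelation falls outside the bound
sig_lags <- which(abs(p$acf[, 1, 1]) > bound)
sig_lags
```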

After looking at a few SARIMAX models, I settled on the following model, which had the lowest MSE on the validation data set:

(0,0,0)(0,0,1)[13]

That is, a seasonal MA term of order 1 with period 13, X = a dummy vector identifying the 2 significant drops, and a constant.
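For readers who want to try a model of this shape, the following is a hedged sketch using base R's arima. The answer does not say which observations were treated as the two drops, so the dummy below assumes they are the two lowest values in the posted data (0.71 at position 21 and 0.67 at position 34). It also fits on all 43 points rather than on a separate training set, so it will not reproduce the validation results described here.

```r
# Backlog values for ID 905, copied from the question
backlog <- c(
  0.99, 0.96, 0.98, 0.87, 0.95, 0.91, 0.96, 0.92, 0.90, 0.91,
  0.96, 0.95, 0.87, 0.99, 0.95, 0.99, 0.93, 0.94, 0.96, 0.98,
  0.71, 0.84, 0.86, 0.92, 0.91, 1.00, 0.96, 0.92, 0.96, 0.92,
  0.83, 0.93, 0.97, 0.67, 0.89, 0.92, 0.95, 0.94, 0.95, 1.00,
  0.98, 0.94, 0.88
)

# ASSUMPTION: the "2 significant drops" are the two lowest observations
drops <- as.numeric(seq_along(backlog) %in% c(21, 34))

# Non-seasonal (0,0,0), seasonal (0,0,1) with period 13,
# the dummy as an exogenous regressor, and a constant (include.mean)
fit <- arima(backlog,
             order = c(0, 0, 0),
             seasonal = list(order = c(0, 0, 1), period = 13),
             xreg = drops,
             include.mean = TRUE)
fit

# In-sample mean squared error of the residuals
mean(residuals(fit)^2)
```

The fitted coefficients are the seasonal MA term (sma1), the intercept, and the effect of the dummy. Forecasting with predict would also need future values of the dummy passed through newxreg.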

Below is the performance of a rolling one-step-ahead prediction:

Validation Data Results

Hope this helps, good luck!

Upvotes: 3
