How to transform time series code to automated code to use for multiple time series?

Question

I want to transform my time series code for one time series to an automated code which can be used for multiple time series data (my data contains a monthly time series). My general approach for one time series was to remove the seasonal component and take first differences to achieve stationarity. Then I use auto.arima to get the ARIMA parameters. I use these parameters to build my ARIMA model with my original time series data. Then I forecast and compare to the actual data of 4 months (which I have cut out before) and calculate the RMSE. As I cannot use my actual data, I just generate a random time series and test set as an example - of course the outcome does not make much sense.

library('forecast')
set.seed(123)

# create random time series and 4 months testing data
ts <- ts(runif(26, min = 50, max = 3000), start = c(2017,01), end = c(2019,02), frequency = 12)
test.data <- runif(4, min = 50, max = 3000)

# Decomompose
comp.ts = decompose(ts)

# subtrect seasonal trend
ts2 <- ts - comp.ts$seasonal
ts2 <- diff(ts2, differences=1)

auto.arima(ts2, trace = T, seasonal = TRUE,ic = 'aicc', max.p = 10,max.q = 10,max.P = 10,max.Q = 10,max.d = 10, stepwise = F)

# Use auto.arima outcome as input
my.arima <- Arima(ts2, order=c(0,0,0),seasonal = list(order = c(0,1,0), period = 12),method="ML", include.drift = F)
# Forecast and calculate RMSE
data.forecast <- forecast(my.arima, h=4, level=c(99.5))
my.difference <- test.data - data.forecast$mean
my.rmse <- (sum(sqrt(my.difference^2)))/length(my.difference)

As my actual data set contains over 500 time series, I need to automate the whole process. Unfortunately, I have not used R for time series so far, so I have problems coming up with an automated process.

Lets assume 4 random time series with 4 random test sets. How could I generate an automated process for these time series (which I can also use for my actual 500+ time series) which does the exact same thing as above?

ts1 <- ts(runif(26, min = 50, max = 3000), start = c(2017,01), end = c(2019,02), frequency = 12)
ts2 <- ts(runif(26, min = 50, max = 3000), start = c(2017,01), end = c(2019,02), frequency = 12)
ts3 <- ts(runif(26, min = 50, max = 3000), start = c(2017,01), end = c(2019,02), frequency = 12)
ts4 <- ts(runif(26, min = 50, max = 3000), start = c(2017,01), end = c(2019,02), frequency = 12)
test.data1 <- runif(4, min = 50, max = 3000)
test.data2 <- runif(4, min = 50, max = 3000)
test.data3 <- runif(4, min = 50, max = 3000)
test.data4 <- runif(4, min = 50, max = 3000)

Thanks for the help!

jay.sf · Accepted Answer

Just put your workflow into a function.

serialArima <- function(ts, test.data) {
  library(forecast)
  # Decomompose
  comp.ts=decompose(ts)

  # subtrect seasonal trend
  ts2 <- ts - comp.ts$seasonal
  ts2 <- diff(ts2, differences=1)

  auto.arima(ts2, trace=T, seasonal=TRUE, ic='aicc', max.p=0, max.q=0, max.P=0,
             max.Q=0, max.d=0, stepwise=F)

  # Use auto.arima outcome as input
  my.arima <- Arima(ts2, order=c(0, 0, 0),
                    seasonal=list(order=c(0, 1, 0), period=2), 
                    method="ML", include.drift=F)
  # Forecast and calculate RMSE
  data.forecast <- forecast(my.arima, h=4, level=c(99.5))
  my.difference <- test.data - data.forecast$mean
  my.rmse <- (sum(sqrt(my.difference^2)))/length(my.difference)
  return(list(data.forecast=data.forecast, my.difference=my.difference, my.rmse=my.rmse))
}

Singular application

serialArima(ts, test.data)
# ARIMA(0,0,0)           with zero mean     : 82.45803
# ARIMA(0,0,0)           with non-zero mean : 88.13593
# 
# 
# 
# Best model: ARIMA(0,0,0)           with zero mean     
# 
# $data.forecast
# Point Forecast   Lo 99.5  Hi 99.5
# 2020.00      -349.1424 -2595.762 1897.477
# 2020.50       772.6014 -1474.018 3019.221
# 2021.00      -349.1424 -3526.342 2828.057
# 2021.50       772.6014 -2404.598 3949.801
# 
# $my.difference
# Time Series:
#   Start = c(2020, 1) 
# End = c(2021, 2) 
# Frequency = 2 
# [1] 1497.2446  840.4139 2979.4553  993.5614
# 
# $my.rmse
# [1] 1577.669

Multiple application

Map(serialArima, list(ts1, ts2, ts3, ts4), 
    list(test.data1, test.data2, test.data3, test.data4))

How to transform time series code to automated code to use for multiple time series?

Answers (1)

Singular application

Multiple application

Related Questions