Rick Arko

Reputation: 680

Combining Forecast Results into a Cohesive Data Frame in R

Loop to Compute and Store auto.arima() and forecast() Results in Dataframe

A small sample of my dataframe, filled with random data, can be generated with the following:

df <- data.frame(col1 = runif(24, 400, 700),
                  col2 = runif(24, 350, 600),
                  col3 = runif(24, 600, 940),
                  col4 = runif(24, 2000, 2600),
                  col5 = runif(24, 950, 1200))


colnames(df) <- c("NorthHampton to EastHartford", "NorthHampton to Edison", 
                  "NorthHampton to Yonkers", "North Hampton to Brooklyn", "NorthHampton to Rotterdam" )

I'm trying to run a series of ARIMA models using auto.arima() in R and am having difficulty generating my output in the desired format. A sample of where I started is below:

ts <- ts(df, frequency = 12, start = c(2014, 1), end = c(2015, 12))

model  <- list()
results <- list()


for (i in 1:ncol(ts)) {
  fit <- auto.arima(ts[,i], stepwise = F, approximation = F)
  model <- forecast(fit)$method
  results <- forecast(fit, h = 3)$mean
  
#   print(forecast(fit)$method)
#   print(forecast(fit, h=3)$mean)

  }

Ideally I want my loop to populate a data.frame that will be formatted like this:

Lane                                 Model                          Time    PointEstimate
Northampton to East Hartford    "ARIMA(0,0,0) with non-zero mean"   Jan-16                  
Northampton to East Hartford    "ARIMA(0,0,0) with non-zero mean"   Feb-16                  
Northampton to East Hartford    "ARIMA(0,0,0) with non-zero mean"   Mar-16                  
Northampton to Edison           "ARIMA(0,0,0) with non-zero mean"   Jan-16                  
Northampton to Edison           "ARIMA(0,0,0) with non-zero mean"   Feb-16                  
Northampton to Edison           "ARIMA(0,0,0) with non-zero mean"   Mar-16                  
Northampton to Yonkers          "ARIMA(0,0,0) with non-zero mean"   Jan-16                  

The Lane column should hold the column name from the original dataframe. The Model column should hold the result of forecast(fit)$method, and PointEstimate should hold the result of forecast(fit, h = 3)$mean, with each lane and model repeated h times (3 in this case).

I think my loop is performing the calculations I need; I just can't figure out how to store the results and then append the results of each later iteration until the loop ends. I appreciate any help I can get on this.

Upvotes: 2

Views: 2160

Answers (3)

eduardo andraders

Reputation: 1

To fix the Date line, replace it with this code (as.yearmon() comes from the zoo package):

Date = as.yearmon(time(forecast(fit, h)$mean)),

Upvotes: 0

alexwhitworth

Reputation: 4907

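Learn how to assemble data.frames and how to inspect the structure (str()) of the objects you work with. This is a relatively simple exercise: build one small data.frame per column inside the loop, store it in a list, and bind the list at the end.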

library(forecast)
library(data.table)

combine_ts <- function(df, h = 3, frequency = 12, start = c(2014, 1), end = c(2015, 12)) {
  results <- list()
  ts <- ts(df, frequency = frequency, start = start, end = end)

  for (i in seq_len(ncol(ts))) {
    fit <- auto.arima(ts[, i], stepwise = FALSE, approximation = FALSE)
    fc  <- forecast(fit, h = h)  # forecast once and reuse the result

    # One block of h rows per lane
    results[[i]] <- data.frame(Lane = rep(colnames(ts)[i], h),
                               Model = rep(fc$method, h),
                               Date = format(as.Date(time(fc$mean)), "%b-%y"),
                               PointEstimate = as.numeric(fc$mean))
  }
  # Stack the per-lane blocks into a single table
  return(data.table::rbindlist(results))
}

R> combine_ts(df)
                            Lane                           Model   Date PointEstimate
 1: NorthHampton to EastHartford ARIMA(0,0,0) with non-zero mean Jan-16      536.1760
 2: NorthHampton to EastHartford ARIMA(0,0,0) with non-zero mean Feb-16      536.1760
 3: NorthHampton to EastHartford ARIMA(0,0,0) with non-zero mean Mar-16      536.1760
 4:       NorthHampton to Edison ARIMA(1,0,0) with non-zero mean Jan-16      488.9687
 5:       NorthHampton to Edison ARIMA(1,0,0) with non-zero mean Feb-16      498.8986
 6:       NorthHampton to Edison ARIMA(1,0,0) with non-zero mean Mar-16      502.4015
 7:      NorthHampton to Yonkers ARIMA(0,0,0) with non-zero mean Jan-16      764.8654
 8:      NorthHampton to Yonkers ARIMA(0,0,0) with non-zero mean Feb-16      764.8654
 9:      NorthHampton to Yonkers ARIMA(0,0,0) with non-zero mean Mar-16      764.8654
10:    North Hampton to Brooklyn ARIMA(0,0,0) with non-zero mean Jan-16     2304.5727
11:    North Hampton to Brooklyn ARIMA(0,0,0) with non-zero mean Feb-16     2304.5727
12:    North Hampton to Brooklyn ARIMA(0,0,0) with non-zero mean Mar-16     2304.5727
13:    NorthHampton to Rotterdam ARIMA(0,0,0) with non-zero mean Jan-16     1094.5927
14:    NorthHampton to Rotterdam ARIMA(0,0,0) with non-zero mean Feb-16     1094.5927
15:    NorthHampton to Rotterdam ARIMA(0,0,0) with non-zero mean Mar-16     1094.5927
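
The same collect-then-bind pattern can be sketched in base R alone. This is only an illustration under stated assumptions: arima() with a fixed order and predict() stand in for auto.arima() and forecast(), the two lane names are hypothetical, and the Model label is written by hand rather than read from the fit.

```r
# Base-R sketch; arima()/predict() stand in for auto.arima()/forecast().
set.seed(42)
demo <- data.frame(`A to B` = runif(24, 400, 700),
                   `A to C` = runif(24, 350, 600),
                   check.names = FALSE)            # hypothetical lanes
demo_ts <- ts(demo, frequency = 12, start = c(2014, 1))
h <- 3

blocks <- lapply(seq_len(ncol(demo_ts)), function(i) {
  fit  <- arima(demo_ts[, i], order = c(0, 0, 0))  # fixed order (assumption)
  pred <- predict(fit, n.ahead = h)$pred           # ts holding h point forecasts
  yr   <- floor(as.numeric(time(pred)) + 1e-6)     # guard against float drift
  mon  <- month.abb[as.integer(cycle(pred))]       # English, locale-independent
  data.frame(Lane = colnames(demo_ts)[i],
             Model = "ARIMA(0,0,0) with non-zero mean",  # label written by hand
             Time = sprintf("%s-%02d", mon, yr %% 100),
             PointEstimate = as.numeric(pred),
             stringsAsFactors = FALSE)
})
out <- do.call(rbind, blocks)  # one row per lane and forecast month
print(out)
```

Building the month labels from month.abb rather than format(..., "%b-%y") keeps them in English regardless of the session locale.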

Upvotes: 2

HubertL

Reputation: 19544

You can try something like:

library(forecast)
fits <- lapply(1:ncol(ts),  function(i) auto.arima(ts[,i], stepwise = F, approximation = F))
models <- sapply(1:ncol(ts), function(i) forecast(fits[[i]])$method)
results <- lapply(1:ncol(ts), function(i) forecast(fits[[i]], h = 3)$mean)

resultsdf <- data.frame(do.call(rbind, results))
colnames(resultsdf) <- format(as.Date(time(results[[1]])), "%b-%y")
resultsdf$Lane=colnames(df)
resultsdf$Model=models

library(reshape2)
res <- melt(resultsdf, id.vars = 4:5, measure.vars = 1:3, variable.name = "Time", value.name = "PointEstimate")

                           Lane                           Model     Time PointEstimate
1  NorthHampton to EastHartford ARIMA(0,0,0) with non-zero mean janv.-16      546.9441
2        NorthHampton to Edison ARIMA(0,0,0) with non-zero mean janv.-16      487.6225
3       NorthHampton to Yonkers ARIMA(0,0,0) with non-zero mean janv.-16      778.9514
4     North Hampton to Brooklyn ARIMA(1,0,0) with non-zero mean janv.-16     2459.3983
5     NorthHampton to Rotterdam ARIMA(1,0,0) with non-zero mean janv.-16     1098.1912
6  NorthHampton to EastHartford ARIMA(0,0,0) with non-zero mean févr.-16      546.9441
7        NorthHampton to Edison ARIMA(0,0,0) with non-zero mean févr.-16      487.6225
8       NorthHampton to Yonkers ARIMA(0,0,0) with non-zero mean févr.-16      778.9514
9     North Hampton to Brooklyn ARIMA(1,0,0) with non-zero mean févr.-16     2416.4848
10    NorthHampton to Rotterdam ARIMA(1,0,0) with non-zero mean févr.-16     1077.3921
11 NorthHampton to EastHartford ARIMA(0,0,0) with non-zero mean  mars-16      546.9441
12       NorthHampton to Edison ARIMA(0,0,0) with non-zero mean  mars-16      487.6225
13      NorthHampton to Yonkers ARIMA(0,0,0) with non-zero mean  mars-16      778.9514
14    North Hampton to Brooklyn ARIMA(1,0,0) with non-zero mean  mars-16     2397.1000
15    NorthHampton to Rotterdam ARIMA(1,0,0) with non-zero mean  mars-16     1085.3332
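
Note that melt() stacks the rows month by month, whereas the question asks for them grouped by lane. If lane-major order matters, the wide table can also be unrolled by hand in base R. The sketch below uses a small made-up wide table (the lane names, model labels, and values are placeholders, not forecast output) to show the idea.

```r
# Hypothetical wide table: one row per lane, one column per forecast month.
wide <- data.frame(`Jan-16` = c(1, 4),
                   `Feb-16` = c(2, 5),
                   `Mar-16` = c(3, 6),
                   check.names = FALSE)
wide$Lane  <- c("A to B", "A to C")   # placeholder lane names
wide$Model <- c("m1", "m2")           # placeholder model labels

times <- c("Jan-16", "Feb-16", "Mar-16")

# Transposing puts each lane's forecasts in one column of the matrix, so
# as.vector() (column-major) returns them lane by lane.
long <- data.frame(Lane = rep(wide$Lane, each = length(times)),
                   Model = rep(wide$Model, each = length(times)),
                   Time = rep(times, times = nrow(wide)),
                   PointEstimate = as.vector(t(as.matrix(wide[times]))),
                   stringsAsFactors = FALSE)
print(long)
```

With the melted result, sorting the rows by Lane afterwards is another option; the manual version simply avoids the extra dependency.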

Upvotes: 2
