Rice Man

Reputation: 435

Forecast model giving odd MAPE values: can someone please tell me if this is correct?

I ran this script as part of a forecasting project for school, but I got some odd results, especially with the MAPE values. It is supposed to predict the number of international terrorism incidents for the next 12 months. Can anyone tell me if this report is accurate or if I missed something? I tried to include the graphs, but I don't think they can be posted here.

Thanks

library(ggplot2)
library(forecast)
library(tseries)
library(reprex)

terror <- tibble::tribble(
  ~imonth, ~iyear, ~monthly,
  1,   2015,     1534,
  2,   2015,     1295,
  3,   2015,     1183,
  4,   2015,     1277,
  5,   2015,     1316,
  6,   2015,     1168,
  7,   2015,     1263,
  8,   2015,     1290,
  9,   2015,     1107,
  10,   2015,     1269,
  11,   2015,     1172,
  12,   2015,     1091,
  1,   2016,     1162,
  2,   2016,     1153,
  3,   2016,     1145,
  4,   2016,     1120,
  5,   2016,     1353,
  6,   2016,     1156,
  7,   2016,     1114,
  8,   2016,     1162,
  9,   2016,     1045,
  10,   2016,     1140,
  11,   2016,     1114,
  12,   2016,      923,
  1,   2017,      879,
  2,   2017,      879,
  3,   2017,      961,
  4,   2017,      856,
  5,   2017,     1081,
  6,   2017,     1077,
  7,   2017,      994,
  8,   2017,      968,
  9,   2017,      838,
  10,   2017,      805,
  11,   2017,      804,
  12,   2017,      749
)

# aggregated data
terror_byMonth_Train = ts(data = terror$monthly,
                          start = c(2015,1), 
                          end = c(2016,12),
                          frequency=12) 

terror_byMonth_Test = ts(data = terror$monthly,
                         start = c(2017,1), 
                         end = c(2017,12),
                         frequency=12)
# arima instead of exp smooth
m_arima <- auto.arima(terror_byMonth_Train)
#> Warning in value[[3L]](cond): The chosen test encountered an error, so no
#> seasonal differencing is selected. Check the time series data.

# fit exp smooth model
m_ets = ets(terror_byMonth_Train) 
# Get length of terror_byMonth_Test set
size <- length(terror_byMonth_Test)

# forecast for 2017 using multiple forecast (Davis Style)
f_arima_multi <- m_arima %>%
  forecast(h = size)

f_arima_multi %>%
  autoplot()

# forecast ARIMA 2017 (Original Style)
f_arima<-forecast(m_arima,h=12)
f_arima %>%
  autoplot()

# forecast ETS 2017
f_ets = forecast(m_ets, h=12) 
f_ets %>%
  autoplot()

# check accuracy ETS
acc_ets <- accuracy(m_ets) 

#check accuracy ARIMA, between train and test sets
acc_arima_TrainVSTest <- accuracy(f_arima_multi, x = terror_byMonth_Test)

# check accuracy ARIMA
acc_arima <- accuracy(f_arima)

# MAPE(ETS)= 20.03 < MAPE(ARIMA) = 22.05
# ETS model chosen 

# Compare to 2017 data
accuracy(f_ets, terror_byMonth_Test)
#>                     ME      RMSE       MAE       MPE      MAPE      MASE
#> Training set -14.30982  90.08823  70.06438 -1.606862  5.900178 0.5790445
#> Test set     303.53575 316.03133 303.53575 23.986363 23.986363 2.5085599
#>                       ACF1 Theil's U
#> Training set  0.0008690031        NA
#> Test set     -0.2148651254  2.356116

Created on 2019-02-13 by the reprex package (v0.2.1)

Upvotes: 2

Views: 135

Answers (1)

Julius Vainora

Reputation: 48211

The issue is in how you defined terror_byMonth_Test. It should be, e.g.,

terror_byMonth_Test <- ts(data = tail(terror$monthly, 12),
                          start = c(2017, 1), 
                          end = c(2017, 12),
                          frequency = 12)

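That is, simply providing start and end dates isn't enough for ts to know which 12 observations out of the 36 in terror$monthly to take. When data is longer than the start-to-end span, ts uses the first values, so the original test set actually held the 2015 counts labelled as 2017. With the corrected test set, MAPE drops to 10.4%.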
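To see this behaviour in isolation, here is a minimal sketch in base R. The vector `x` is a hypothetical placeholder (just the numbers 1 to 36) standing in for terror$monthly, and the names `wrong_test`, `full`, `train` and `test` are only illustrative. It also shows one possible alternative to tail(): build a single ts for the whole period and split it by date with window().

```r
# Hypothetical placeholder for terror$monthly: 36 monthly values (2015-2017)
x <- 1:36

# start/end only set the time labels; ts() keeps the FIRST 12 values of x
wrong_test <- ts(x, start = c(2017, 1), end = c(2017, 12), frequency = 12)
print(wrong_test)

# Alternative: one series for the full period, then split by date
full  <- ts(x, start = c(2015, 1), frequency = 12)
train <- window(full, end = c(2016, 12))    # Jan 2015 to Dec 2016
test  <- window(full, start = c(2017, 1))   # Jan 2017 to Dec 2017
print(test)
```

Splitting with window() keeps the values and their dates tied together, so the test set cannot silently pick up observations from the wrong year.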

Upvotes: 1
