Reputation: 435
I ran this script as part of a forecasting project for school, but I got some odd results, especially for the MAPE values. The script is supposed to predict the number of international terrorism incidents for the next 12 months. Can anyone tell me whether this report is accurate or whether I missed something? I tried to include the graphs, but I don't think they can be posted here.
Thanks
library(ggplot2)
library(forecast)
library(tseries)
library(reprex)
terror <- tibble::tribble(
~imonth, ~iyear, ~monthly,
1, 2015, 1534,
2, 2015, 1295,
3, 2015, 1183,
4, 2015, 1277,
5, 2015, 1316,
6, 2015, 1168,
7, 2015, 1263,
8, 2015, 1290,
9, 2015, 1107,
10, 2015, 1269,
11, 2015, 1172,
12, 2015, 1091,
1, 2016, 1162,
2, 2016, 1153,
3, 2016, 1145,
4, 2016, 1120,
5, 2016, 1353,
6, 2016, 1156,
7, 2016, 1114,
8, 2016, 1162,
9, 2016, 1045,
10, 2016, 1140,
11, 2016, 1114,
12, 2016, 923,
1, 2017, 879,
2, 2017, 879,
3, 2017, 961,
4, 2017, 856,
5, 2017, 1081,
6, 2017, 1077,
7, 2017, 994,
8, 2017, 968,
9, 2017, 838,
10, 2017, 805,
11, 2017, 804,
12, 2017, 749
)
# aggregated data
terror_byMonth_Train = ts(data = terror$monthly,
start = c(2015,1),
end = c(2016,12),
frequency=12)
terror_byMonth_Test = ts(data = terror$monthly,
start = c(2017,1),
end = c(2017,12),
frequency=12)
# arima instead of exp smooth
m_arima <- auto.arima(terror_byMonth_Train)
#> Warning in value[[3L]](cond): The chosen test encountered an error, so no
#> seasonal differencing is selected. Check the time series data.
# fit exp smooth model
m_ets = ets(terror_byMonth_Train)
# Get length of terror_byMonth_Test set
size <- length(terror_byMonth_Test)
# forecast for 2017 using multiple forecast (Davis Style)
f_arima_multi <- m_arima %>%
forecast(h = size)
f_arima_multi %>%
autoplot()
# forecast ARIMA 2017 (Original Style)
f_arima<-forecast(m_arima,h=12)
f_arima %>%
autoplot()
# forecast ETS 2017
f_ets = forecast(m_ets, h=12)
f_ets %>%
autoplot()
# check accuracy ETS
acc_ets <- accuracy(m_ets)
#check accuracy ARIMA, between train and test sets
acc_arima_TrainVSTest <- accuracy(f_arima_multi, x = terror_byMonth_Test)
# check accuracy ARIMA (training set only)
acc_arima <- accuracy(f_arima)
# MAPE(ETS)= 20.03 < MAPE(ARIMA) = 22.05
# ETS model chosen
# Compare to 2017 data
accuracy(f_ets, terror_byMonth_Test)
#> ME RMSE MAE MPE MAPE MASE
#> Training set -14.30982 90.08823 70.06438 -1.606862 5.900178 0.5790445
#> Test set 303.53575 316.03133 303.53575 23.986363 23.986363 2.5085599
#> ACF1 Theil's U
#> Training set 0.0008690031 NA
#> Test set -0.2148651254 2.356116
Created on 2019-02-13 by the reprex package (v0.2.1)
Upvotes: 2
Views: 135
Reputation: 48211
The issue is in how you defined terror_byMonth_Test. It should be, e.g.,
terror_byMonth_Test <- ts(data = tail(terror$monthly, 12),
start = c(2017, 1),
end = c(2017, 12),
frequency = 12)
That is, simply providing start and end dates isn't enough for ts to know which 12 of the 36 observations in terror$monthly to take; it just uses the first 12, i.e. the 2015 values. This reduces MAPE to 10.4%.
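To see the behaviour for yourself, here is a minimal sketch using only base R. It copies the monthly counts from the question into a plain vector; the names monthly, test_wrong, full, train and test are just illustrative and not part of the original script. The second half shows one possible alternative to tail(): build a single series and split it by date with window().

```r
# Monthly counts copied from the question (36 values, Jan 2015 to Dec 2017).
monthly <- c(1534, 1295, 1183, 1277, 1316, 1168, 1263, 1290, 1107, 1269, 1172, 1091,
             1162, 1153, 1145, 1120, 1353, 1156, 1114, 1162, 1045, 1140, 1114, 923,
             879, 879, 961, 856, 1081, 1077, 994, 968, 838, 805, 804, 749)

# Definition from the question: start/end only set the time labels,
# so ts() fills them with the first 12 values of the vector (the 2015 data).
test_wrong <- ts(monthly, start = c(2017, 1), end = c(2017, 12), frequency = 12)
identical(as.numeric(test_wrong), monthly[1:12])

# Alternative sketch: build one series for the whole period,
# then split it by date with window() instead of slicing by hand.
full  <- ts(monthly, start = c(2015, 1), frequency = 12)
train <- window(full, end = c(2016, 12))
test  <- window(full, start = c(2017, 1))
identical(as.numeric(test), tail(monthly, 12))
```

Both identical() calls should return TRUE: the first confirms that the original test set was really the 2015 data with 2017 labels, and the second confirms that window() picks out the same 12 values as tail(terror$monthly, 12).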
Upvotes: 1