Reputation: 43
My data is timeseries :
library(forecast)
library(Mcomp)
ts.data<- subset(M3, 12)[[551]]
print(ts.data)
Output:
Series: M551
Type of series: INDUSTRY
Period of series: MONTHLY
Series description: Lumber, softwoods, western pine, production
HISTORICAL data
Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
1983 5960 6320 7050 7080 7170 7620 7000 7790 7830 7580 6620 6860
1984 7240 6550 8730 8690 7650 7950 7210 7880 7040 7940 7290 5750
1985 6410 6230 7260 8220 7770 7830 7590 9210 8340 8660 7330 7000
1986 7500 7430 8350 9010 8470 8900 8720 9710 10070 9610 8410 8640
1987 8500 8920 10470 8960 9390 10630 9380 10050 9030 10580 9350 8810
1988 9300 10430 10570 10440 10120 8780 7460 8230 9830 10260 9270 9260
1989 9260 8150 9930 8840 9150 10230 9340 10170 9150 10420 8690 8960
1990 9960 9050 10430 9650 9350 8790 8550 8790 7590 8730 7520 6110
1991 7450 7230 7680 8270 8060 8210 8260 8910 8540 8180 7430 6880
1992 7360 7560 8800 7550 7900 8320 7430
FUTURE data
Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
1992 7650 7460 8580 7300 6530
1993 7070 6940 7060 7470 6190 6310 7090 7310 6310 7270 7240 6410
1994 7470
How can i build ARIMA model and then find the Mean Absolute Error (MAE) of the model ?
Any thoughts to start would be helpfull.
Upvotes: 1
Views: 338
Reputation: 5788
Here's one way you can do it (Note: to do this properly you should also investigate ACF/PACF prior to modelling as well as perform cross-validation after producing the initial models):
# Install pacakges if they are not already installed: necessary_packages => vector
necessary_packages <- c("forecast", "Mcomp")
# Create a vector containing the names of any packages needing installation:
# new_pacakges => vector
new_packages <- necessary_packages[!(necessary_packages %in%
installed.packages()[, "Package"])]
# If the vector has more than 0 values, install the new pacakges
# (and their) associated dependencies:
if(length(new_packages) > 0){install.packages(new_packages, dependencies = TRUE)}
# Initialise the packages in the session: list of boolean => stdout (console)
lapply(necessary_packages, require, character.only = TRUE)
# Subset the M3 data to contain the relevant series:
ts.data <- subset(M3, 12)[[551]]
# Extract the historical data:
historical_production_ts <- ts.data[["x"]]
# Set the seed for reproducibility:
set.seed(2020)
# It is clear from this series we have trend and seasonality:
decomposed_ts <- decompose(historical_production_ts)
# Chart the decomposed series:
plot(decomposed_ts)
# Note: we have to make series stationary in order to compute
# an ARIMA model, to do this we need to account for trend and
# seasonality:
# Use a unit root test to determine the number of differences required
# to make the trend of the series stationary:
trend_required_diffs <- ndiffs(decomposed_ts$trend)
# Use a unit root test to determin the number of differences required
# to make the seasonality of the series stationary:
seasonality_required_diffs <- ndiffs(decomposed_ts$seasonal)
# Create an auto-arima model and store the result in the fit variable:
fit <- auto.arima(historical_production_ts,
# Account for trend:
d = trend_required_diffs,
# Account for seasonality:
D = seasonality_required_diffs,
# Try out alot of different models:
stepwise = FALSE,
# Don't approximate the AIC:
approximation = FALSE)
# Check the Mean Absolute Error (MAE) of the model:
data.frame(accuracy(fit))[,"MAE"]
# Forecast out the number of future observations required:
aa_fcast <- forecast(fit, length(ts.data$xx))
# Chart the forecast:
autoplot(aa_fcast) +
autolayer(ts.data$xx, series = "Actual Production (Future)") +
autolayer(aa_fcast$mean, series = "Forecasts")
# A function to calculate the MAE:
mae <- function(actual, pred){
mean(abs(pred-actual))
}
# Calculate the accuracy of the forecast against the future data:
mean_abs_error <- mae(ts.data$xx, data.frame(aa_fcast)[,"Point.Forecast", drop = TRUE])
Upvotes: 2