Reputation: 751
I implemented a SARIMA model in R and decided to used a plot in order to visually investigate the performance of my model. Here is my code:
library(forecast)
Number_of_visitor <- runif(396, 24000,6600000)
tsdata_trainig <- ts(Number_of_visitor[157:348], frequency = 12, start = c(2003,1))
arima.model <- Arima(tsdata_trainig,order=c(0, 1, 1),
seasonal=list(order=c(0, 1, 1), period=12))
forecast_2019 <- predict(arima.model, n.ahead = 12)$pred # Predictions for 2019 to test my model
tsdata_2019_22_observed <- ts(Number_of_visitor[349:396], frequency = 12, start = c(2019,1)) # Observed data from 2019 to 2022
tsdata_post_pandemic_forecast <- ts(predict(arima.model, n.ahead = 48)$pred[13:48],frequency = 12, start = c(2020,1)) # Predictions for 2020, 2021 and 2022
#Code for plot
ts.plot(forecast_2019, tsdata_2019_22_observed, log = "y", lty = c(1,3), xlab="Year", ylab="Number of Visitors")
lines(tsdata_post_pandemic_forecast,col="red")
Here is the plot the above code produces
As you see, the graph lacks aesthetic. Can someone help me get a better graph please? May be a ggplot?
A plot like this would be great. In my case, Year
on x axis and Number of visitors
on y axis should appear and the lines should represent the forecasts and observed values.
Thanks in advance
Upvotes: 0
Views: 254
Reputation: 269854
We simplify the initial manipulation and then plot using ggplot2.
library(ggplot2)
library(zoo)
set.seed(123)
Number_of_visitor <- runif(396, 24000,6600000)
z <- zooreg(Number_of_visitor, freq = 12, start = 1990)
observed <- window(z, start = "2019-01")
training <- window(z, start = "2003-01", end = "2018-12")
arima.mod <- Arima(training, order = c(0, 1, 1),
seasonal = list(order = c(0, 1, 1), period = 12))
forecast_2019 <- predict(arima.mod, n.ahead = 12, se.fit = FALSE) |> as.zoo()
forecast_post <- predict(arima.mod, n.ahead = length(observed), se.fit = FALSE) |>
tail(-12) |>
as.zoo()
zz <- cbind(forecast_2019, forecast_post, observed)
autoplot(zz, facet = NULL) +
scale_color_manual(values = c("black", "red", "grey")) +
scale_x_yearmon(format = "%Y") +
scale_y_log10() +
xlab("")
Upvotes: 1
Reputation: 2321
One option is to convert your ts
objects to dataframes. There are a few ways to do this, but I went with the timetk
package since the code is straightforward. Then you can combine both datasets and plot them using ggplot()
aesthetics.
library(tidyverse)
library(forecast)
library(zoo)
Number_of_visitor <- runif(396, 24000, 6600000)
tsdata_trainig <- ts(Number_of_visitor[157:348], frequency = 12, start = c(2003,1))
arima.model <- forecast::Arima(tsdata_trainig, order = c(0, 1, 1), seasonal=list(order = c(0, 1, 1), period = 12))
forecast_2019 <- predict(arima.model, n.ahead = 12)$pred
tsdata_2019_22_observed <- ts(Number_of_visitor[349:396], frequency = 12, start = c(2019,1))
df1 <- timetk::tk_tbl(timetk::tk_tbl(forecast_2019)) %>% mutate(id = "forecast_2019")
df2 <- timetk::tk_tbl(timetk::tk_tbl(tsdata_2019_22_observed)) %>% mutate(id = "tsdata_2019_22_observed")
df3 <- bind_rows(df1, df2)
df3$index <- as.Date(zoo::as.yearmon(df3$index)) # convert index to date here
ggplot(df3, aes(x = index, y = value, color = id, linetype = id)) +
geom_point() + geom_path() +
xlab("Year") + ylab("People visited") +
scale_y_continuous(labels = scales::comma) +
scale_x_date(breaks = "1 year", date_labels = "%Y") +
scale_linetype_manual(values = c("solid", "dashed"))
Upvotes: 1