Günal
Günal

Reputation: 751

Is An aesthetic plot for time series data possible in R?

I implemented a SARIMA model in R and decided to used a plot in order to visually investigate the performance of my model. Here is my code:

library(forecast)
Number_of_visitor <- runif(396, 24000,6600000)
tsdata_trainig <- ts(Number_of_visitor[157:348], frequency = 12, start = c(2003,1))
arima.model <- Arima(tsdata_trainig,order=c(0, 1, 1),
                 seasonal=list(order=c(0, 1, 1), period=12))

forecast_2019 <- predict(arima.model, n.ahead = 12)$pred # Predictions for 2019 to test my model
tsdata_2019_22_observed <- ts(Number_of_visitor[349:396], frequency = 12, start = c(2019,1)) # Observed data from 2019 to 2022
tsdata_post_pandemic_forecast <- ts(predict(arima.model, n.ahead = 48)$pred[13:48],frequency = 12, start = c(2020,1)) # Predictions for 2020, 2021 and 2022 

#Code for plot
ts.plot(forecast_2019, tsdata_2019_22_observed, log = "y", lty = c(1,3), xlab="Year", ylab="Number of Visitors")
lines(tsdata_post_pandemic_forecast,col="red")

Here is the plot the above code produces

enter image description here

As you see, the graph lacks aesthetic. Can someone help me get a better graph please? May be a ggplot?

A plot like this would be great. In my case, Year on x axis and Number of visitors on y axis should appear and the lines should represent the forecasts and observed values.

enter image description here

Thanks in advance

Upvotes: 0

Views: 254

Answers (2)

G. Grothendieck
G. Grothendieck

Reputation: 269854

We simplify the initial manipulation and then plot using ggplot2.

library(ggplot2)
library(zoo)

set.seed(123)
Number_of_visitor <- runif(396, 24000,6600000)

z <- zooreg(Number_of_visitor, freq = 12, start = 1990)
observed <- window(z, start = "2019-01")
training <- window(z, start = "2003-01", end = "2018-12")

arima.mod <- Arima(training, order = c(0, 1, 1),
                 seasonal = list(order = c(0, 1, 1), period = 12))
forecast_2019 <- predict(arima.mod, n.ahead = 12, se.fit = FALSE) |> as.zoo()
forecast_post <- predict(arima.mod, n.ahead = length(observed), se.fit = FALSE) |>
  tail(-12) |>
  as.zoo()

zz <- cbind(forecast_2019, forecast_post, observed)
autoplot(zz, facet = NULL) +
  scale_color_manual(values = c("black", "red", "grey")) +
  scale_x_yearmon(format = "%Y") +
  scale_y_log10() + 
  xlab("")

screenshot

Upvotes: 1

jrcalabrese
jrcalabrese

Reputation: 2321

One option is to convert your ts objects to dataframes. There are a few ways to do this, but I went with the timetk package since the code is straightforward. Then you can combine both datasets and plot them using ggplot() aesthetics.

library(tidyverse)
library(forecast)
library(zoo)

Number_of_visitor <- runif(396, 24000, 6600000)
tsdata_trainig <- ts(Number_of_visitor[157:348], frequency = 12, start = c(2003,1))
arima.model <- forecast::Arima(tsdata_trainig, order = c(0, 1, 1), seasonal=list(order = c(0, 1, 1), period = 12))
forecast_2019 <- predict(arima.model, n.ahead = 12)$pred 
tsdata_2019_22_observed <- ts(Number_of_visitor[349:396], frequency = 12, start = c(2019,1)) 

df1 <- timetk::tk_tbl(timetk::tk_tbl(forecast_2019)) %>% mutate(id = "forecast_2019")
df2 <- timetk::tk_tbl(timetk::tk_tbl(tsdata_2019_22_observed)) %>% mutate(id = "tsdata_2019_22_observed")
df3 <- bind_rows(df1, df2)
df3$index <- as.Date(zoo::as.yearmon(df3$index)) # convert index to date here

ggplot(df3, aes(x = index, y = value, color = id, linetype = id)) + 
  geom_point() + geom_path() +
  xlab("Year") + ylab("People visited") +
  scale_y_continuous(labels = scales::comma) + 
  scale_x_date(breaks = "1 year", date_labels = "%Y") +
  scale_linetype_manual(values = c("solid", "dashed"))

enter image description here

Upvotes: 1

Related Questions