M. Talha Bin Asif

Reputation: 161

One Step Ahead Forecasting in R

I am using these few data points to forecast the following intervals via one-step-ahead forecasting. I have built a custom function for this, but when I print the results, no value is returned for 2022. I would appreciate help with getting a forecast for the next year.

My data:

Ad1 <- structure(list(Year = c(2012, 2013, 2014, 2015, 2016, 2017, 2018, 
2019, 2020, 2021), Adm.Numbers = c(1660, 1726, 1846, 1955, 2026, 
1999, 1954, 1924, 1952, 2078)), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -10L))

Code:

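library(dplyr)     # slice(), bind_rows(), select(), mutate(), %>%
library(forecast)  # auto.arima(), forecast()

summary(Ad1)
plot(Ad1, type = "b", col = "black")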

Arima_prediction_1 <- function(k) {
  
  range_validation <- 3
  n_ahead <- 1
  
  train_tbl <- Ad1 %>% slice((1 + k):(2 + k))
  valid_tbl <- Ad1 %>% slice((2 + 1 + k):(2 + k + range_validation)) 
  test_tbl  <- Ad1 %>% slice((2 + k + range_validation + 1):(2 + k + range_validation + n_ahead))
  
  train_arima <- bind_rows(train_tbl, valid_tbl) %>% select(1:2)
  test_arima <- test_tbl %>% select(1:2)
  
  # ARIMA model: 
  my_arima <- auto.arima(train_arima[, 2] %>% ts(start = 1))
  
  # Use the model for forecasting: 
  predicted_arima <- forecast(my_arima, h = 1)$mean %>% as.vector()
  
  actual_predicted_df_test <- test_arima %>% 
    mutate(predicted = predicted_arima) 
  
  return(actual_predicted_df_test)
  
}

options(scipen = 9999)
arima_results <- lapply(0:5, Arima_prediction_1)
arima_results <- do.call("bind_rows", arima_results)
View(arima_results)

Upvotes: 0

Views: 1278

Answers (1)

DPH

Reputation: 4344

You do not see 2022 because that year does not exist in your data frame (Ad1), which is where you extract the test rows from.
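To make that concrete, here is a small base-R sketch of what happens for the last window (k = 5). It rebuilds the data as a plain data.frame rather than a tibble, and `test_row` is just an illustrative name for the row index your function computes. As far as I know, dplyr::slice silently ignores positions beyond the last row, so test_tbl comes back empty and mutate has no row to attach the forecast to.

```r
# Base-R copy of the question's data (a plain data.frame, not a tibble).
Ad1 <- data.frame(
  Year = 2012:2021,
  Adm.Numbers = c(1660, 1726, 1846, 1955, 2026, 1999, 1954, 1924, 1952, 2078)
)

k <- 5                      # last window produced by lapply(0:5, ...)
test_row <- 2 + k + 3 + 1   # 2 training rows + 3 validation rows + 1 step ahead

test_row > nrow(Ad1)        # TRUE: row 11 does not exist in Ad1
Ad1$Year[test_row]          # NA: there is no stored year to label the forecast
max(Ad1$Year) + 1           # 2022: the label has to be computed instead
```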

So you need to tweak your code a little, mainly by generating the corresponding sequence of years. Instead of dplyr::slice I used direct index selection on the data frame, and I also made some minor changes to your code:

library(forecast)
library(dplyr)

Arima_prediction_1 <- function(k) {

    range_validation <- 3
    n_ahead <- 1

    train_tbl <- Ad1[(1 + k):(2 + k), ]
    valid_tbl <- Ad1[(2 + 1 + k):(2 + k + range_validation), ] 
    # Take the last validation year, build the sequence up to that year plus
    # n_ahead, and drop the first element (the last validation year itself).
    test_tbl  <- data.frame(Year = seq(from = max(valid_tbl$Year), 
                                       to = max(valid_tbl$Year) + n_ahead)[-1],
                            Adm.Numbers = Ad1[(2 + k + range_validation + 1):(2 + k + range_validation + n_ahead), 2])

    train_arima <- rbind(train_tbl, valid_tbl) 
    test_arima <- test_tbl

    # ARIMA model: 
    my_arima <- forecast::auto.arima(ts(train_arima[, 2], start = 1))

    # Use the model for forecasting: 
    predicted_arima <- forecast::forecast(my_arima, h = 1)$mean %>% as.vector()

    actual_predicted_df_test <- test_arima %>% 
        dplyr::mutate(predicted = predicted_arima) 

    return(actual_predicted_df_test)

}

arima_results <- lapply(0:5, Arima_prediction_1) 
do.call("bind_rows", arima_results)

  Year Adm.Numbers predicted
1 2017        1999  2026.000
2 2018        1954  1981.793
3 2019        1924  1956.000
4 2020        1952  1971.600
5 2021        2078  1971.000
6 2022          NA  1981.400

I do not quite understand why valid_tbl is separated from train_tbl only to be bound back together before the actual calculation. Possibly you reduced the code complexity for the reprex.
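If the split is not needed, the function can work on a single rolling window. The sketch below is only an illustration under a few assumptions: `window_size` and `one_step` are names I made up, the data is rebuilt as a plain data.frame, and I swapped forecast::auto.arima for a fixed random-walk model via stats::arima(order = c(0, 1, 0)) so that it runs on base R alone. With that stand-in, each forecast is simply the last observed value, so the numbers will differ from the auto.arima output above.

```r
# Same data as in the question, as a plain data.frame.
Ad1 <- data.frame(
  Year = 2012:2021,
  Adm.Numbers = c(1660, 1726, 1846, 1955, 2026, 1999, 1954, 1924, 1952, 2078)
)

window_size <- 5  # 2 "train" + 3 "validation" rows from the original code

one_step <- function(k) {
  train <- Ad1[(1 + k):(window_size + k), ]

  # Random-walk ARIMA(0,1,0): a stand-in for forecast::auto.arima().
  fit  <- stats::arima(ts(train$Adm.Numbers), order = c(0, 1, 0))
  pred <- as.numeric(predict(fit, n.ahead = 1)$pred)

  data.frame(
    Year        = max(train$Year) + 1,                   # computed, not looked up
    Adm.Numbers = Ad1$Adm.Numbers[window_size + k + 1],  # NA beyond the data
    predicted   = pred
  )
}

results <- do.call(rbind, lapply(0:5, one_step))
results
```

To get back to the original approach, replace the stats::arima and predict lines with the auto.arima and forecast calls from the function above.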

Upvotes: 1
