JontroPothon

Reputation: 536

Preprocessing tsibble to run time series models from fable package

I am trying to fit some models to monthly time series data. The series are not of equal length, and they do not start or end in the same month. I have a numeric month column and a numeric year column. I have combined those two variables into a time column and tried to make a tsibble out of it so that I can use the fable package. This is how I am processing the data.

Here is some simulated data.

# Packages
library(tidyverse)
library(tsibble)
library(fable)
library(fabletools)

# Simulated data
id <- c(rep(222, 28), rep(111, 36), rep(555, 16))
year <- c(rep(2014, 12), rep(2015, 12), rep(2016, 4), 
          rep(2014, 12), rep(2015, 12), rep(2016, 12), 
          rep(2015, 12), rep(2016, 4))
mnt <- c(seq(1, 12, by = 1), seq(1, 12, by = 1), seq(1, 4, by = 1),
         seq(1, 12, by = 1), seq(1, 12, by = 1), seq(1, 12, by = 1),
         seq(1, 12, by = 1), seq(1, 4, by = 1))
value <- rnorm(80, mean = 123, sd = 50)
dataf <- data.frame(id, mnt, year, value)

To make it a tsibble, I convert my month variable mnt to a character:

# Map month numbers 1..12 to "Jan".."Dec" using the built-in month.abb vector
dataf$mnt <- month.abb[dataf$mnt]

Adding month and year together

dataf %>% unite("time", mnt:year, sep = " ")

Make a tsibble

tsbl <- as_tsibble(dataf, index = time, key = id)

At this point I get this error:

> tsbl <- as_tsibble(dataf, index = time, key = id)
Error: `var` must evaluate to a single number or a column name, not a function
Call `rlang::last_error()` to see a backtrace.

The rest of the code is:

# Fitting arima 
fit <- tsbl %>%
  fill_gaps(b = 0) %>% 
  model(
    arima = ARIMA(value),
  )
fit

# One month ahead forecast
fc <- fit %>%
  forecast(h = 1)
fc

# Accuracy measure
accuracy_table <- accuracy(fit)

Any idea how to preprocess my data so that I can run forecasting models from the fable package?

Upvotes: 1

Views: 317

Answers (1)

ravic_

Reputation: 1831

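There are two small issues in how you create the time column. The first is that you aren't assigning the result of unite() back to the dataf data frame; it is only printed to the console, so dataf never gets a time column. With no such column, time most likely resolves to the base R function of that name, which would explain the "not a function" part of the message. Assigning the result fixes the error you posted.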
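To illustrate the first point, here is a minimal sketch on a small hypothetical data frame (`df` and its values are made up for this example, and the call is written without a pipe for brevity):

```r
library(tidyr)

# Small hypothetical data frame standing in for `dataf`
df <- data.frame(mnt = c("Jan", "Feb"), year = c(2014, 2014), value = c(1, 2))

# Without assignment the united result is only printed; `df` is unchanged
unite(df, "time", c(year, mnt), sep = " ")
before <- "time" %in% names(df)   # FALSE

# With assignment the new `time` column is kept
df <- unite(df, "time", c(year, mnt), sep = " ")
after <- "time" %in% names(df)    # TRUE
```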

The second is that the index needs a compatible data type. A character column isn't enough for a monthly index, so you'll want something like the tsibble function yearmonth() to convert it. To give yearmonth() a string it can parse, I flipped the order of the columns in your unite() call so that the year comes first.

The relevant piece:

dataf <- dataf %>% unite("time", c(year, mnt), sep = " ") %>%
  mutate(time = yearmonth(time))
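If you would rather skip the month-name conversion entirely, one possible alternative is to build a first-of-month Date from the numeric columns and pass that to yearmonth(). This is only a sketch: the shortened data frame below is made up for illustration, and it assumes mnt is still numeric.

```r
library(tsibble)

set.seed(1)
# Shortened, hypothetical version of the simulated data (numeric month and year)
dataf <- data.frame(
  id    = c(rep(222, 16), rep(555, 4)),
  year  = c(rep(2014, 12), rep(2015, 4), rep(2016, 4)),
  mnt   = c(1:12, 1:4, 1:4),
  value = rnorm(20, mean = 123, sd = 50)
)

# Build a first-of-month Date, then convert it to a yearmonth index
dataf$time <- yearmonth(as.Date(sprintf("%d-%02d-01", dataf$year, dataf$mnt)))

# One series per id, indexed by month
tsbl <- as_tsibble(dataf, index = time, key = id)
tsbl
```

The resulting tsbl can then go into the model() and forecast() calls from the question.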

Upvotes: 1
