Reputation: 536
I am trying to run some models on monthly time series data. The time series are not of equal length and do not start or end in the same month. What I have is a numeric month column and a numeric year column. I have created a time series from those two variables and made a tsibble out of it so that I can use the fable package. This is what I am doing to process the time series data; I am posting simulated data here.
# Packages
library(tidyverse)
library(tsibble)
library(fable)
library(fabletools)
# Simulated data
id <- c(rep(222, 28), rep(111, 36), rep(555, 16))
year <- c(rep(2014, 12), rep(2015, 12), rep(2016, 4),
          rep(2014, 12), rep(2015, 12), rep(2016, 12),
          rep(2015, 12), rep(2016, 4))
mnt <- c(seq(1, 12, by = 1), seq(1, 12, by = 1), seq(1, 4, by = 1),
         seq(1, 12, by = 1), seq(1, 12, by = 1), seq(1, 12, by = 1),
         seq(1, 12, by = 1), seq(1, 4, by = 1))
value <- rnorm(80, mean = 123, sd = 50)
dataf <- data.frame(id, mnt, year, value)
To make it a tsibble, I am converting my month variable mnt into a character:
# Map month numbers 1..12 to "Jan".."Dec" using the built-in month.abb vector
dataf$mnt <- month.abb[dataf$mnt]
Adding month and year together
dataf %>% unite("time", mnt:year, sep = " ")
Make a tsibble
tsbl <- as_tsibble(dataf, index = time, key = id)
At this point, I get this error:
> tsbl <- as_tsibble(dataf, index = time, key = id)
Error: `var` must evaluate to a single number or a column name, not a function
Call `rlang::last_error()` to see a backtrace.
The remaining code is this:
# Fitting arima
fit <- tsbl %>%
  fill_gaps(b = 0) %>%
  model(
    arima = ARIMA(value),
  )
fit
# One month ahead forecast
fc <- fit %>%
  forecast(h = 1)
fc
# Accuracy measure
accuracy_table <- accuracy(fit)
Any idea how to preprocess my data so that I can run forecasting models from the fable package?
Upvotes: 1
Views: 317
Reputation: 1831
You have two small issues in the step where you create the time column. The first is that you aren't assigning the result of unite() back to the dataf data frame; it is only printed to the console. Fixing that will resolve the error you posted.
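To illustrate the point about assignment, here is a minimal sketch on a tiny made-up data frame (the two rows are an assumption for illustration, not your data). Without the assignment, dataf never gains a time column, so as_tsibble() most likely ends up resolving the name time to the base R function of that name, which would explain the "not a function" wording in the error.

```r
library(tidyr)  # provides unite() and re-exports %>%

# Tiny illustrative data frame (hypothetical values)
dataf <- data.frame(id = 1, mnt = c("Jan", "Feb"), year = 2014, value = c(10, 20))

# Without assignment: the united result is printed, dataf is left unchanged
dataf %>% unite("time", mnt:year, sep = " ")
before <- "time" %in% names(dataf)   # FALSE: no time column was stored

# With assignment: the new column is kept
dataf <- dataf %>% unite("time", mnt:year, sep = " ")
after <- "time" %in% names(dataf)    # TRUE
```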
The next piece is that the index needs a compatible data type. A character column isn't quite enough, so you'll want something like the tsibble function yearmonth() to convert it. To make that work, I flipped the order of the columns in your unite() call so that the year comes first.
The relevant piece:
dataf <- dataf %>%
  unite("time", c(year, mnt), sep = " ") %>%
  mutate(time = yearmonth(time))
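Putting it together, a sketch of the whole preprocessing step on the simulated data might look like the following. It assumes the month numbers are first turned into abbreviations with month.abb (a shortcut for the manual recoding in the question), and set.seed() is added only to make the simulated values reproducible. The second variant uses make_yearmonth(), which I believe is available in recent tsibble versions and skips the character step entirely; treat it as an alternative to try rather than a requirement.

```r
library(dplyr)
library(tidyr)
library(tsibble)

set.seed(1)  # only for reproducible simulated values
dataf <- data.frame(
  id    = c(rep(222, 28), rep(111, 36), rep(555, 16)),
  mnt   = c(1:12, 1:12, 1:4, 1:12, 1:12, 1:12, 1:12, 1:4),
  year  = c(rep(2014, 12), rep(2015, 12), rep(2016, 4),
            rep(2014, 12), rep(2015, 12), rep(2016, 12),
            rep(2015, 12), rep(2016, 4)),
  value = rnorm(80, mean = 123, sd = 50)
)

# Variant 1: character "2014 Jan" parsed by yearmonth()
tsbl <- dataf %>%
  mutate(mnt = month.abb[mnt]) %>%
  unite("time", c(year, mnt), sep = " ") %>%
  mutate(time = yearmonth(time)) %>%
  as_tsibble(index = time, key = id)

# Variant 2 (assumes a tsibble version that provides make_yearmonth()):
# build the index directly from the numeric year and month columns
tsbl2 <- dataf %>%
  mutate(time = make_yearmonth(year = year, month = mnt)) %>%
  select(-mnt, -year) %>%
  as_tsibble(index = time, key = id)

tsbl
```

From here the tsibble can be passed on to fill_gaps() and model() as in the question.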
Upvotes: 1