I'm trying to build an ARIMA model to forecast occupancy rates in an office. The data contain some NAs: those dates are national holidays, when no one is in the office, so nothing is recorded. How can I deal with those NA values when building an ARIMA model?
Example of NA:
2019-04-19 09:00:00 12.878788
2019-04-19 10:00:00 19.848485
2019-04-19 11:00:00 21.969697
2019-04-19 12:00:00 11.212121
2019-04-19 13:00:00 14.090909
2019-04-19 14:00:00 16.363636
2019-04-19 15:00:00 22.727273
2019-04-19 16:00:00 7.727273
2019-04-22 09:00:00 NA
2019-04-22 10:00:00 NA
2019-04-22 11:00:00 NA
2019-04-22 12:00:00 NA
2019-04-22 13:00:00 NA
2019-04-22 14:00:00 NA
2019-04-22 15:00:00 NA
2019-04-22 16:00:00 NA
2019-04-23 09:00:00 23.636364
2019-04-23 10:00:00 49.545455
2019-04-23 11:00:00 57.575758
2019-04-23 12:00:00 48.030303
2019-04-23 13:00:00 45.151515
2019-04-23 14:00:00 35.606061
2019-04-23 15:00:00 25.151515
2019-04-23 16:00:00 8.333333
I tried using this code:
plot(stl(ts, na.action = na.omit))
But I got this error:
Error in na.omit.ts(as.ts(x)) : time series contains internal NAs
Upvotes: 1
Views: 1711
Reputation: 31800
ARIMA models in R handle NAs without any problem. STL decompositions do not handle NAs, which is where your error is coming from: na.omit can only strip NAs from the start or end of a time series, not from the middle.
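To illustrate the first point, here is a minimal sketch that uses only base R. The model order (a seasonal "airline" model) and the choice of method = "ML" are assumptions made for the example, not a recommendation for the occupancy data.

```r
# Put an artificial gap into a built-in monthly series
x <- USAccDeaths
x[20:23] <- NA

# stats::arima() evaluates the likelihood through a state-space form,
# so missing observations are allowed (handled exactly with method = "ML")
fit <- arima(x,
             order = c(0, 1, 1),
             seasonal = list(order = c(0, 1, 1), period = 12),
             method = "ML")

# Forecast two years ahead; fc$pred holds point forecasts, fc$se standard errors
fc <- predict(fit, n.ahead = 24)
print(fit)
```

The NAs stay in the series; no imputation step is needed before fitting.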
If you want to do an STL, you could use mstl() from the forecast package, which estimates the missing values for you.
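library(forecast)
library(ggplot2)

# Knock out four observations to simulate missing data
USAccDeaths[20:23] <- NA

# mstl() estimates the missing values before decomposing
USAccDeaths %>%
  mstl(s.window = "periodic") %>%
  autoplot()

# auto.arima() is fitted directly to the series containing NAs
USAccDeaths %>%
  auto.arima() %>%
  forecast(h = 24) %>%
  autoplot()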
Created on 2019-11-30 by the reprex package (v0.3.0)
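If you would rather keep calling stl() itself, one possible workaround is to fill the gaps first with na.interp() from the forecast package and decompose the filled series. This is only a sketch on the same built-in data; the variable names x_filled and dec are made up for the example, and whether interpolated values are appropriate for holiday gaps depends on your data.

```r
library(forecast)

# Same artificial gap as before
x <- USAccDeaths
x[20:23] <- NA

# na.interp() replaces only the missing values with interpolated estimates
x_filled <- na.interp(x)

# stl() now works because the series has no internal NAs
dec <- stl(x_filled, s.window = "periodic")
plot(dec)
```

Use the filled series only for the decomposition; the ARIMA model can still be fitted to the original series with its NAs.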
Upvotes: 3