stats-hb

Reputation: 988

Impute missing values in timeseries via bsts

I work with a minute-resolution time series in which about 20% of the data is missing (in gaps of varying length).

AFAIK Bayesian methods can handle missing data elegantly, so I would like to fit a Bayesian time series model and then use it to impute the missing values (ideally returning a credible interval as well).

I'd hope to fit the model on the whole dataset, including the missing data points, and then impute all the values at once, avoiding the complexity (and computational cost) of rolling multi-horizon forecasts. I'm currently planning to use the bsts package for the imputation, but I'm open to other options as well.

(I have tried forecast::na.interp as well as imputeTS::na.seadec for the imputation, but I hope to improve the accuracy of the imputations a little more by including external regressors.)

As you can see below, I haven't been able to extract a timeseries without missing values yet.

library(magrittr)
library(bsts)

# Load data
data(iclaims)
claims_nsa <- initial.claims$iclaimsNSA

# Create missing values at random positions (about 20% of the series)
set.seed(1)  # make the NA positions reproducible
n <- length(claims_nsa)
na_pos <- 1:n %>%
  sample(size = floor(n / 5))
claims_nsa[na_pos] <- NA

# Fit Model
ss <- AddLocalLinearTrend(list(), claims_nsa)
ss <- AddSeasonal(ss, claims_nsa, nseasons = 52)
model1 <- bsts(claims_nsa,
               state.specification = ss,
               niter = 100,
               model.options = BstsOptions(save.full.state = TRUE))

# Fiddle around with model object
predict(model1, horizon = 10)
str(model1)
model1$full.state %>% str()

Upvotes: 1

Views: 527

Answers (2)

stats-hb

Reputation: 988

I don't really know what I am doing, but this seems to work OK:

I think I may have to aggregate the state contributions from the model object across the MCMC samples. The exact procedure may depend on the order of the model.

model1$state.contributions %>% 
  apply(c(2, 3), median) %>% 
  colSums()

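This takes the median of each state component at each time point across the MCMC iterations and then sums the components, which gives one fitted value per time point (including the missing ones).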
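If a credible interval is wanted too, one possible variation is to sum the components within each MCMC draw first and then take quantiles across draws. The following is only a sketch: impute_from_state is a hypothetical helper name, and it assumes the array is laid out as (MCMC draw, state component, time), as model1$state.contributions appears to be. The resulting interval covers the state (the conditional mean) only and does not include observation noise.

```r
# contrib: array with dimensions (MCMC draw, state component, time),
#          e.g. model1$state.contributions (assumed layout)
# burn:    number of initial MCMC draws to discard
# probs:   quantiles to report for each time point
impute_from_state <- function(contrib, burn = 0,
                              probs = c(0.025, 0.5, 0.975)) {
  if (burn > 0) {
    contrib <- contrib[-seq_len(burn), , , drop = FALSE]
  }
  # Sum over components within each draw: result is draw x time
  draws <- apply(contrib, c(1, 3), sum)
  # Quantiles across draws for every time point: result is time x quantile
  t(apply(draws, 2, quantile, probs = probs))
}

# Usage with the objects from the question (needs the fitted model):
# est <- impute_from_state(model1$state.contributions,
#                          burn = SuggestBurn(0.1, model1))
# est[na_pos, ]   # median and 95% interval at the missing positions
```

Summing before taking quantiles keeps the dependence between trend and seasonal components within a draw, which taking per-component medians and then adding them does not.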

Upvotes: 1

Gonzalo Falloux Costa

Reputation: 372

Have you tried the mice package?

library(mice)

# Pass both the predictor columns and the columns to be imputed
mice_mod <- mice(YourDataFrame[, VariablesYouWantToUseForImputationAndTheVariablesYouWantToImpute],
                 method = "norm")

The "norm" method imputes by Bayesian linear regression.

Upvotes: 0
