gtx

Reputation: 1

How to output sequence of hidden states on rjags after training the HMM model?

I am totally new to HMMs and JAGS. My question is: how can I get the hidden state sequence based on the trained model structure and parameters? Specifically, I have trained a hidden Markov model using rjags and saved the posterior samples of the model parameters, which contain 59 parameters, each of length 1000 (30000 iterations, thinning = 30). So right now I have the model structure, correctly written in JAGS, and the updated parameters (30000 burn-in iterations, then 30000 sampling iterations with thinning of 30). The question is: how can I get the corresponding hidden states? I've been working in machine learning for a while, so my thought was to load the structure and parameters first, then input a test sequence of observations, and then I would get what I want. But it seems to work differently here. So far I haven't found any examples of this, even in the JAGS documentation. Does anyone know something about this? Thanks a lot. My training code is below:


# -------------------  model ------------------------#

load('./data/try_3_gtx_3d.RData')

library(rjags)
library(mcmcplots)

data_1 <- list(Y = three_dim_smc_data - 1,   # Y should be 0 and 1
               M = dim(team_id_unique_ordered)[1],
               N = one_dim_sample_count,
               T = two_dim_sample_length,
               gp = three_dim_gp_data,
               sm = three_dim_sm_data)


inits_HMM <- list( beta0.p = runif(2, -10, 10))

parameters <- c("beta0", "PAD", "PDA", "bAD", "bDA", "sigmaAD", "sigmaDA", 
                "sigmab", "betagp", 
                "b", "delta0",  'betasm')

# Compile the model with 3 chains (no adaptation iterations)
result <- jags.model("model/longitudinal_model_gp_sm_try1_gtx.txt", data_1,
                     inits_HMM, n.chains = 3, n.adapt = 0)

# Burn-in
update(result, n.iter = 30000)

# Draw posterior samples
rsamps_relience_pg_sm_try2 <- coda.samples(result, parameters, n.iter = 30000,
                                           thin = 30)

save(rsamps_relience_pg_sm_try2, file = './results/rsamps_resilience_pg_sm_try3.RData')

mcmcplot(rsamps_relience_pg_sm_try2)

summary(rsamps_relience_pg_sm_try2)$statistics[-c(3:317), ]
summary(rsamps_relience_pg_sm_try2)$quantiles[-c(3:317), ]

If any other materials are required, please let me know. Thank you so much!

I'm trying to output the sequences of hidden states corresponding to given test observation sequences, based on the model structure and the trained parameters.

Upvotes: 0

Views: 111

Answers (1)

Théo Michelot

Reputation: 166

One approach would be to use the Viterbi algorithm to obtain the most likely state sequence for the new data, given the parameter estimates that you have. For background on the Viterbi algorithm and the general problem of state decoding, see for example Problem 2 in the "three basic problems of HMMs" outlined by Rabiner (1989), A Tutorial on Hidden Markov Models (Section III.B). Most commonly, this is done using the point estimates of the HMM parameters (e.g., posterior means in your case).
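If you want to see what that looks like concretely, below is a minimal sketch of a hand-rolled Viterbi decoder in R. It assumes you have already collapsed your posterior samples to point estimates, e.g. with colMeans(as.matrix(your_coda_samples)); the names viterbi_decode, delta, Gamma and dens are illustrative, not part of your model or of any package.

# Minimal Viterbi decoder for an N-state HMM, given point estimates of:
#   delta: initial state distribution (length N)
#   Gamma: transition probability matrix (N x N)
#   dens:  function(obs, state) returning the state-dependent density
viterbi_decode <- function(obs, delta, Gamma, dens) {
  n <- length(obs)
  N <- length(delta)
  logv <- matrix(-Inf, nrow = n, ncol = N)  # log-prob of best path ending in each state
  back <- matrix(0L, nrow = n, ncol = N)    # backpointers
  logv[1, ] <- log(delta) + sapply(1:N, function(j) log(dens(obs[1], j)))
  for (t in 2:n) {
    for (j in 1:N) {
      cand <- logv[t - 1, ] + log(Gamma[, j])
      back[t, j] <- which.max(cand)
      logv[t, j] <- max(cand) + log(dens(obs[t], j))
    }
  }
  # Backtrack from the most likely final state
  states <- integer(n)
  states[n] <- which.max(logv[n, ])
  for (t in (n - 1):1) states[t] <- back[t + 1, states[t + 1]]
  states
}

# Illustrative 2-state Gaussian example (made-up parameter values)
delta <- c(0.5, 0.5)
Gamma <- matrix(c(0.9, 0.1,
                  0.2, 0.8), nrow = 2, byrow = TRUE)
dens <- function(x, j) dnorm(x, mean = c(-1, 1)[j], sd = 0.5)
obs <- c(rnorm(50, -1, 0.5), rnorm(50, 1, 0.5))
viterbi_decode(obs, delta, Gamma, dens)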

If you don't want to implement the Viterbi algorithm yourself, many existing R packages are available. For example, the code below shows the workflow in depmixS4, with a 2-state Gaussian HMM fitted to a time series of log-returns that is included in the package. The steps are:

  1. Fit model on training data
  2. Create new model for test data
  3. Copy estimated parameters from trained model using setpars()
  4. Find most likely state sequence using viterbi()

# I'm using a financial data set provided in depmixS4
library(depmixS4)
data(sp500)

# Split into training and test sets
train <- sp500[1:500,]
test <- sp500[-(1:500),]

# Create model for training data
par0 <- c(0, 0.01, 0, 0.1)
mod_train <- depmix(logret ~ 1, 
              data = train, 
              nstates = 2, 
              family = gaussian(), 
              respstart = par0)

# Fit model on training data
fit_train <- fit(mod_train)

# Create model for new/test data
mod_new <- depmix(logret ~ 1, 
                  data = test, 
                  nstates = 2, 
                  family = gaussian())

# Copy estimated parameters from trained model
mod_new <- setpars(mod_new, getpars(mod_train))

# Get most likely state sequence for new data
vit_new <- viterbi(mod_new)

# Plot log-returns in new data, coloured by states
plot(test$logret, col = vit_new$state, 
     type = "h", ylab = "log-return",
     main = "Decoded states for new data")
points(test$logret, col = vit_new$state, 
       pch = 20, cex = 0.5)
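As a quick sanity check on the decoding, you can also inspect the returned object directly; the state column used above holds the decoded state for each observation:

head(vit_new)           # first few rows of the Viterbi output
table(vit_new$state)    # number of observations assigned to each state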

[Figure: log-returns for the new data, coloured by the Viterbi-decoded states]

Upvotes: 0
