R optimizing the number of states in hidden markov model

Question

I am trying to find a way to optimize the number of states in a hidden Markov model (HMM) in R. There are a number of R packages for HMMs, but I am having trouble estimating the optimal number of hidden states. Thank you for your help.

Backlin · Accepted Answer

To tune the number of hidden states you need a vector of candidate values, nhs, and a performance measure, perf() (some kind of error measure that evaluates how good a model is). Then build one model for each candidate number of hidden states and select the one that gives the best performance.

Here is a pseudocode example of how to do it.

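# train.HMM() and perf() are placeholders for your package's fitting
# function and your own error measure (lower is better).
nhs <- c(1, 2, 3, 5, 8, 11, 15)        # candidate numbers of hidden states
error <- rep(NA_real_, length(nhs))
for (i in seq_along(nhs)) {
    fit <- train.HMM(data, nhs[i])     # fit a model with nhs[i] hidden states
    error[i] <- perf(fit)
}
nhs[which.min(error)] # Optimal number of hidden states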

I guess the performance measure in your case would be how well the model predicts new, unseen examples. I suggest you do cross-validation for each number of hidden states. Something along these lines:

...
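for (i in seq_along(nhs)) {
    pred <- vector("list", k)          # one set of predictions per fold
    for (fold in seq_len(k)) {
        # fold.id is a placeholder vector assigning each element of data to a fold
        fit <- train.HMM(data[fold.id != fold], nhs[i])
        pred[[fold]] <- predict(fit, data[fold.id == fold])
    }
    error[i] <- perf(pred)
}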
...
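The cross-validation loop leaves open how the held-out and training index sets are built. If the data is a single ordered sequence, one option is to use contiguous blocks as folds, since assigning observations to folds at random would break the ordering that the HMM relies on. The following is only a sketch under that assumption; x, n and fold.id are illustrative names, and the simulated x stands in for real data.

```r
set.seed(1)
k <- 5                                   # number of folds
n <- 100                                 # hypothetical sequence length
x <- rnorm(n)                            # stand-in for the observed sequence

# Contiguous fold labels: 1,1,...,2,2,...,k (the last fold may be shorter)
fold.id <- rep(seq_len(k), each = ceiling(n / k), length.out = n)

for (fold in seq_len(k)) {
    in.fold <- which(fold.id == fold)    # indices held out in this fold
    train   <- x[-in.fold]               # everything else is used for fitting
    test    <- x[in.fold]
    # fit on train and evaluate on test here
}
```

Note that dropping a block from the middle joins two separate segments in train, so the transition across that join is artificial. With long sequences this is usually a small effect, but it is an approximation.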

The reason I didn't provide more detailed code is to not clutter the example (and since you did not provide a reproducible example to work from).
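As one possible concrete choice for perf() when there is no labelled outcome to predict, you could use the negative log-likelihood of the held-out observations under the fitted model. The sketch below computes that log-likelihood with the scaled forward algorithm, assuming Gaussian emissions. The function hmm_loglik() and its arguments are made up for illustration, and how you extract the initial distribution, transition matrix and emission parameters from a fitted model depends on the package you use.

```r
# Log-likelihood of sequence x under a Gaussian HMM (scaled forward algorithm).
# init:  initial state probabilities (length m)
# trans: m x m transition matrix, trans[i, j] = P(next state j | current state i)
# means, sds: emission parameters per state (length m each)
hmm_loglik <- function(x, init, trans, means, sds) {
    loglik <- 0
    alpha <- init
    for (t in seq_along(x)) {
        if (t > 1) alpha <- as.vector(alpha %*% trans)  # propagate one step
        alpha <- alpha * dnorm(x[t], means, sds)        # weight by emission density
        scale <- sum(alpha)                             # P(x[t] | x[1..t-1])
        loglik <- loglik + log(scale)
        alpha <- alpha / scale                          # rescale to avoid underflow
    }
    loglik
}

# Example with made-up parameters for a 2-state model
set.seed(1)
x.test <- rnorm(50)
trans <- matrix(c(0.9, 0.1,
                  0.2, 0.8), nrow = 2, byrow = TRUE)
error <- -hmm_loglik(x.test, init = c(0.5, 0.5), trans = trans,
                     means = c(-1, 1), sds = c(1, 1))
```

Lower values of error mean the held-out data is better explained, so it can be plugged directly into which.min() as in the loops above.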
