Vaclav Kratochvíl

Reputation: 129

Learning hidden markov model in R

A hidden Markov model (HMM) is one in which you observe a sequence of observations, but do not know the sequence of states the model went through to generate the observations. Analyses of hidden Markov models seek to recover the sequence of hidden states from the observed data.

I have data with both observations and hidden states (the observations are continuous values), where the hidden states were tagged by an expert. I would like to train an HMM that would be able, based on a (previously unseen) sequence of observations, to recover the corresponding hidden states.

Is there any R package to do that? The packages I have studied so far (depmixS4, HMM, and seqHMM, the last of which handles categorical data only) only allow me to specify the number of hidden states.

EDIT:

Example:

data.tagged.by.expert = data.frame(
    hidden.state = c("Wake", "REM", "REM", "NonREM1", "NonREM2", "REM", "REM", "Wake"),
    sensor1 = c(1,1.2,1.2,1.3,4,2,1.78,0.65),
    sensor2 = c(7.2,5.3,5.1,1.2,2.3,7.5,7.8,2.1),
    sensor3 = c(0.01,0.02,0.08,0.8,0.03,0.01,0.15,0.45)
 )

data.newly.measured = data.frame(
    sensor1 = c(2,3,4,5,2,1,2,4,5,8,4,6,1,2,5,3,2,1,4),
    sensor2 =  c(2.1,2.3,2.2,4.2,4.2,2.2,2.2,5.3,2.4,1.0,2.5,2.4,1.2,8.4,5.2,5.5,5.2,4.3,7.8),
    sensor3 = c(0.23,0.25,0.23,0.54,0.36,0.85,0.01,0.52,0.09,0.12,0.85,0.45,0.26,0.08,0.01,0.55,0.67,0.82,0.35)
 )

I would like to create an HMM with discrete time t, where the random variable x(t) represents the hidden state at time t, x(t) ∈ {"Wake", "REM", "NonREM1", "NonREM2"}, and three continuous random variables sensor1(t), sensor2(t), sensor3(t) represent the observations at time t.

model.hmm = learn.model(data.tagged.by.expert)

Then I would like to use the created model to estimate the hidden states responsible for the newly measured observations:

hidden.states = estimate.hidden.states(model.hmm, data.newly.measured)

Upvotes: 4

Views: 2512

Answers (2)

Théo Michelot

Reputation: 186

This can now be done in the hmmTMB R package. (Disclaimer: I am the hmmTMB developer.) This feature of the package is described in the vignette "Advanced features of hmmTMB" (see Section 2, "Using a trained model on new data").

Below, I describe an example using the quakes data set.

Fit model to training data

The first step is to fit/train a model on the training data, to estimate the model parameters (transition probabilities, observation parameters). Here, we will be analysing a time series of earthquake magnitudes (the data frame quakes, loaded with the datasets package) with a 2-state Gaussian HMM.

We start by splitting the data into training data and new data, and creating an HMM from the training data. In hmmTMB, this requires creating an object for the hidden state model and an object for the observation model, and then combining them into an HMM object.

library(hmmTMB)

# Split time series into training data and "new" data
quakes_train <- quakes[1:800,]
quakes_new <- quakes[801:1000,]

# Create hidden state model
hid_train <- MarkovChain$new(data = quakes_train, 
                             n_states = 2)

# Create observation model
dists <- list(mag = "norm")
par0 <- list(mag = list(mean = c(4.2, 5), 
                        sd = c(0.5, 0.5)))
obs_train <- Observation$new(data = quakes_train, 
                             n_states = 2,
                             dists = dists, 
                             par = par0)

# Create hidden Markov model
hmm_train <- HMM$new(hid = hid_train, obs = obs_train)

# Fit HMM on training data
hmm_train$fit(silent = TRUE)
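
Before moving on, it can be worth checking the estimated parameters. Printing the fitted model object should display them (the exact output format follows hmmTMB's print method; see the package documentation):

# Inspect the fitted model: this should show the estimated transition
# probabilities and the state-dependent means/sds
hmm_train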

Create model for new data

We then create a separate model object, passing the new data rather than the training data. Crucially, we use init = hmm_train when we create the HMM object, to copy the parameters of the model fitted to the training data. This will be needed to estimate the hidden state sequence based on that fitted model.

# Create model for new data
hid_new <- MarkovChain$new(data = quakes_new, 
                           n_states = 2)
obs_new <- Observation$new(data = quakes_new, 
                           n_states = 2,
                           dists = dists, 
                           par = par0)

# Copy parameters from hmm_train when creating HMM object
hmm_new <- HMM$new(hid = hid_new, 
                   obs = obs_new, 
                   init = hmm_train)

Estimate hidden states for new data

Finally, we use the method $viterbi() to find the most likely state sequence for the new data set. Because the parameters in hmm_new were initialised to the values estimated from the fitted model, this is doing what you are asking. We could also use $state_probs() to get state probabilities, or $plot_ts() to plot the new data coloured by the most likely state sequence (plot included below).

# Get most likely state sequence
states_new <- hmm_new$viterbi()

# Plot new data coloured by most likely state sequence
# ($plot_ts() returns a ggplot object, so ggplot2 layers can be added)
library(ggplot2)
hmm_new$plot_ts("mag") + 
    geom_point() +
    labs(title = "Decoded states for new data")
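
For a softer output than the single most likely sequence, the $state_probs() method mentioned above gives state probabilities for the new data; a minimal sketch (I assume the result has one row per observation and one column per state):

# Local state probabilities for each new observation
probs_new <- hmm_new$state_probs()
head(probs_new)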

[Figure: time series plot of the new data, coloured by the most likely state sequence]

Upvotes: 0

Vaclav Kratochvíl

Reputation: 129

Data (training/testing)

To be able to run the learning methods for the Naive Bayes classifier, we need a longer data set:

states = c("NonREM1", "NonREM2", "NonREM3", "REM", "Wake")
artificial.hypnogram = rep(c(5,4,1,2,3,4,5), times = c(40,150,200,300,50,90,30))

data.tagged.by.expert = data.frame(
    hidden.state = states[artificial.hypnogram],
    sensor1 = log(artificial.hypnogram) + runif(n = length(artificial.hypnogram), min = 0.2, max = 0.5),
    sensor2 = 10*artificial.hypnogram + sample(c(-8:8), size = length(artificial.hypnogram), replace = T),
    sensor3 = sample(1:100, size = length(artificial.hypnogram), replace = T)
)

hidden.hypnogram = rep(c(5,4,1,2,4,5), times = c(10,10,15,10,10,3))
data.newly.measured = data.frame(
    sensor1 = log(hidden.hypnogram) + runif(n = length(hidden.hypnogram), min = 0.2, max = 0.5),
    sensor2 = 10*hidden.hypnogram + sample(c(-8:8), size = length(hidden.hypnogram), replace = T),
    sensor3 = sample(1:100, size = length(hidden.hypnogram), replace = T)
)
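
Note that the sensor values are drawn with runif() and sample(), so every run yields a different data set; to make the example reproducible, call set.seed() once before building the two data frames above (any fixed seed works):

# run this before generating data.tagged.by.expert and data.newly.measured
# to get reproducible simulated sensor values
set.seed(123)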

Solution

In this solution, we use the Viterbi algorithm combined with a Naive Bayes classifier.

At each clock time t, a hidden Markov model consists of

  • an unobserved state (denoted hidden.state in this case) taking values in a finite set of states

    states = c("NonREM1", "NonREM2", "NonREM3", "REM", "Wake")
    
  • a set of observed variables (sensor1, sensor2, sensor3 in this case)

Transition matrix

A new state is entered based upon a transition probability distribution (the transition matrix). This can easily be estimated from data.tagged.by.expert, e.g. using

library(markovchain)
# transition matrix estimated from the expert-tagged state sequence
trans_p <- markovchainFit(data.tagged.by.expert$hidden.state)$estimate
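
markovchainFit()$estimate returns a markovchain S4 object, which supports the matrix-style indexing used in the Viterbi code below. If you prefer a plain numeric matrix, it can be coerced; a small sketch (assuming the package's standard as() coercion to "matrix"):

# optional: view the transition matrix as a base R matrix
# (rows = current state, columns = next state; rows sum to 1)
as(trans_p, "matrix")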

Emission matrix

After each transition is made, an observation (sensor_i) is produced according to a conditional probability distribution (the emission matrix) that depends only on the current value of hidden.state. Since the observations here are continuous, we replace the emission matrices with a Naive Bayes classifier.

library(caret)
library(klaR)
library(e1071)

model = train(hidden.state ~ .,
              data = data.tagged.by.expert,
              method = 'nb',
              trControl = trainControl(method = 'cv', number = 10))
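
The fitted classifier returns, for any observation vector, the posterior probability of each sleep stage; these posteriors take the place of the emission probabilities in the Viterbi recursion below. For example:

# posterior P(hidden.state | sensor1, sensor2, sensor3) for one observation;
# this is the "emission" term used in the Viterbi recursion below
predict(model$finalModel, newdata = data.newly.measured[1, ])$posterior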

Viterbi algorithm

To solve the problem, we use the Viterbi algorithm with an initial probability of 1 for the "Wake" state and 0 otherwise (we expect the patient to be awake at the beginning of the experiment).

# we expect the patient to be awake in the beginning
start_p = c(NonREM1 = 0,NonREM2 = 0,NonREM3 = 0, REM = 0, Wake = 1)

# Naive Bayes model
model_nb = model$finalModel

# the observations
observations = data.newly.measured 
nObs <- nrow(observations) # number of observations 
nStates <- length(states)  # number of states

# T1, T2 initialization
T1 <- matrix(0, nrow = nStates, ncol = nObs) # T1[s,j]: probability of the most likely path ending in state s at time j
row.names(T1) <- states
T2 <- T1                                     # T2[s,j]: previous state on that path (backpointers)

# init first column of T1: start probabilities times emission probabilities
Byj <- predict(model_nb, newdata = observations[1,])$posterior
for(s in states)
  T1[s,1] <- start_p[s] * Byj[1,s]

# fill T1 and T2 tables
for(j in 2:nObs) {
  Byj <- predict(model_nb, newdata = observations[j,])$posterior
  for(s in states) {
    # probability of reaching state s at time j via each possible previous state
    res <- (T1[,j-1] * trans_p[,s]) * Byj[1,s]
    T2[s,j] <- states[which.max(res)] # best previous state (backpointer)
    T1[s,j] <- max(res)               # probability of the best partial path
  }
}

# backtrack the best path
result <- rep("", times = nObs)
result[nObs] <- names(which.max(T1[,nObs]))
for (j in nObs:2) {
  result[j-1] <- T2[result[j], j]
}

# show the result
result
# show the original artificial data 
states[hidden.hypnogram]
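
Since the true hypnogram is known in this artificial example, we can also compute a simple decoding accuracy (a quick check I added, not part of the original evaluation):

# fraction of time points where the decoded state matches the true one
mean(result == states[hidden.hypnogram])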

References

To read more about the problem, see: Jiří Vomlel and Václav Kratochvíl, "Dynamic Bayesian Networks for the Classification of Sleep Stages", in Proceedings of the 11th Workshop on Uncertainty Processing (WUPES'18), eds. Václav Kratochvíl and Jiřina Vejnarová, Třeboň, Czech Republic, June 2018, pp. 205-215.

Upvotes: 1
