Reputation: 73
The code below is a Python implementation I found here of the Viterbi algorithm used in the HMM model. The link also gives a test case.
In __init__
, I understand that:
initialProb
is the probability to start at the given state, transProb
is the probability to move from one state to another at any given time, butthe parameter I don't understand is obsProb
. Can someone please explain it?
import numpy as np
'''
N: number of hidden states
'''
class Decoder(object):
def __init__(self, initialProb, transProb, obsProb):
self.N = initialProb.shape[0]
self.initialProb = initialProb
self.transProb = transProb
self.obsProb = obsProb
assert self.initialProb.shape == (self.N, 1)
assert self.transProb.shape == (self.N, self.N)
assert self.obsProb.shape[0] == self.N
def Obs(self, obs):
return self.obsProb[:, obs, None]
def Decode(self, obs):
trellis = np.zeros((self.N, len(obs)))
backpt = np.ones((self.N, len(obs)), 'int32') * -1
# initialization
trellis[:, 0] = np.squeeze(self.initialProb * self.Obs(obs[0]))
for t in xrange(1, len(obs)):
trellis[:, t] = (trellis[:, t-1, None].dot(self.Obs(obs[t]).T) * self.transProb).max(0)
backpt[:, t] = (np.tile(trellis[:, t-1, None], [1, self.N]) * self.transProb).argmax(0)
# termination
tokens = [trellis[:, -1].argmax()]
for i in xrange(len(obs)-1, 0, -1):
tokens.append(backpt[tokens[-1], i])
return tokens[::-1]
Upvotes: 1
Views: 8113
Reputation: 3043
A HMM with N
hidden states and M
possible discrete observation values is defined by the following parameters:
initialProb
(vector of size N
): The initial state distribution. The entry initialProb[i]
is the probability P(x_0 = i)
of being in state i
initially (at time 0).transProb
(matrix of size N
xN
): The transition probability matrix. The entry transProb[i][j]
is the probability P(x_{t+1} = j | x_t = i)
of transitioning from state i
to j
.obsProb
(matrix of size N
xM
): The emission probability matrix. The entry obsProb[i][j]
is the probability P(y_t = j | x_t = i)
of emitting symbol j
from state i
.Often, these parameters are named \pi
, T
and E
, respectively, or \pi
, A
and B
.
The standard reference on HMMs is the tutorial by Rabiner, by the way.
Upvotes: 5