Reputation: 13
I'm trying to implement hidden Markov model training in Python, and the resulting NumPy code seems very slow: it takes 30 minutes to train a model. Below is my code, and I agree that it is terribly inefficient. I tried learning about NumPy vectorization and advanced indexing, but couldn't figure out how to apply them to my code. I was able to determine that more than 99% of the execution time is taken by the reestimate()
function, especially the part where it prints CHK5 and CHK6.
import numpy

def reestimate(self):
    newTransition = numpy.zeros(shape=(int(self.num_states), int(self.num_states)))
    newOutput = numpy.zeros(shape=(int(self.num_states), int(self.num_symbols)))
    numerator = numpy.zeros(shape=(int(self.num_obSeq),))
    denominator = numpy.zeros(shape=(int(self.num_obSeq),))
    sumP = 0
    i = 0
    print "CHK1"
    # re-estimate the transition probabilities
    while i < self.num_states:
        j = 0
        while j < self.num_states:
            if j < i or j > i + self.delta:
                newTransition[i, j] = 0
            else:
                k = 0
                print "CHK2"
                while k < self.num_obSeq:
                    numerator[k] = denominator[k] = 0
                    self.setObSeq(self.obSeq[k])
                    sumP += self.computeAlpha()
                    self.computeBeta()
                    t = 0
                    while t < self.len_obSeq - 1:
                        numerator[k] += self.alpha[t][i] * self.transition[i][j] * self.output[j][self.currentSeq[t + 1]] * self.beta[t + 1][j]
                        denominator[k] += self.alpha[t][i] * self.beta[t][i]
                        t += 1
                    k += 1
                denom = 0
                k = 0
                print "CHK3"
                while k < self.num_obSeq:
                    newTransition[i, j] += (1 / sumP) * numerator[k]
                    denom += (1 / sumP) * denominator[k]
                    k += 1
                newTransition[i, j] /= denom
                newTransition[i, j] += self.MIN_PROBABILITY
            j += 1
        i += 1
    sumP = 0
    i = 0
    print "CHK4"
    # re-estimate the output (emission) probabilities
    while i < self.num_states:
        j = 0
        while j < self.num_symbols:
            k = 0
            while k < self.num_obSeq:
                numerator[k] = denominator[k] = 0
                self.setObSeq(self.obSeq[k])
                # print self.obSeq[k]
                sumP += self.computeAlpha()
                self.computeBeta()
                t = 0
                print "CHK5"
                while t < self.len_obSeq - 1:
                    if self.currentSeq[t] == j:
                        numerator[k] += self.alpha[t, i] * self.beta[t, i]
                    denominator[k] += self.alpha[t, i] * self.beta[t, i]
                    t += 1
                k += 1
            denom = 0
            k = 0
            print "CHK6"
            while k < self.num_obSeq:
                newOutput[i, j] += (1 / sumP) * numerator[k]
                denom += (1 / sumP) * denominator[k]
                k += 1
            newOutput[i, j] /= denom
            newOutput[i, j] += self.MIN_PROBABILITY
            j += 1
        i += 1
    self.transition = newTransition
    self.output = newOutput

def train(self):
    i = 0
    while i < 20:
        self.reestimate()
        print "reestimating.....", i
        i += 1
Upvotes: 1
Views: 337
Reputation: 18488
It is straightforward to vectorize your inner loops. Here is an example for the second part of your code (untested of course):
print "CHK4"
for i in xrange(self.num_states):
for j in xrange(self.num_symbols):
for k in xrange(self.num_obSeq):
self.setObSeq(self.obSeq[k])
# print self.obSeq[k]
sumP += self.computeAlpha()
self.computeBeta()
alpha_times_beta = self.alpha[:,i] * self.beta[:,i]
numerator[k] = numpy.sum(alpha_times_beta[self.currentSeq == j])
denominator[k] = numpy.sum(alpha_times_beta)
denom = numpy.sum(denominator)
newOutput[i,j] = numpy.sum(numerator) / (sumP * denom) + self.MIN_PROBABILITY
self.transition = newTransition
self.output = newOutput
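The first half of reestimate() (the newTransition update) can be vectorized along the same lines. Again an untested sketch, and it assumes self.currentSeq is a NumPy integer array so it can be used as a fancy index into self.output:

print "CHK1"
for i in xrange(self.num_states):
    for j in xrange(self.num_states):
        if j < i or j > i + self.delta:
            newTransition[i, j] = 0
            continue
        for k in xrange(self.num_obSeq):
            self.setObSeq(self.obSeq[k])
            sumP += self.computeAlpha()
            self.computeBeta()
            # vectorized form of the t loop over 0 .. len_obSeq - 2:
            # alpha at t, beta and the emitted symbol at t + 1
            numerator[k] = numpy.sum(self.alpha[:-1, i]
                                     * self.transition[i, j]
                                     * self.output[j, self.currentSeq[1:]]
                                     * self.beta[1:, j])
            denominator[k] = numpy.sum(self.alpha[:-1, i] * self.beta[:-1, i])
        # as above, the 1/sumP factors cancel
        newTransition[i, j] = numpy.sum(numerator) / numpy.sum(denominator) + self.MIN_PROBABILITY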
It might be possible to also vectorize the outer loops, but by far the biggest gain is usually obtained by focusing on the inner loops only. Some comments:
- It seems that most of your while loops can be turned into for loops. Even though this does not make a lot of difference for speed, it is the preferred way if you know the number of iterations before the loop starts (see the sketch after this list).
- The convention is to use import numpy as np, and use np.function in the rest of the code.
- Simple loops that just compute a sum (accum = 0; for item in vector: accum += item) should be vectorized as accum = np.sum(vector).
- Conditional summing in a loop can be converted to a vectorized sum with boolean indexing, so accum = 0; for i in range(n): if cond[i]: accum += vector[i] can be replaced with accum = np.sum(vector[cond]).
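To illustrate the first point with the train() loop from your question, a counting loop like

    i = 0
    while i < 20:
        self.reestimate()
        i += 1

can be written as

    for i in xrange(20):
        self.reestimate()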
I am interested to know how much faster your code becomes after these modifications; I would guess you can easily gain more than a factor of 10.
Upvotes: 1