Phorce

Reputation: 2674

HMM - Training data and format

I want to implement an HMM (Hidden Markov Model) to recognise particular words. So far I have managed to extract the MFCCs (mel-frequency cepstral coefficients) of the signal. Is this suitable data for training the HMM?

Also, is the format (below) correct for training the HMM?

The format:

For each sample there is a sequence of MFCC coefficients. Here are two of these samples as an example:

-13.8033 0.645476 3.2174 -0.625136 -0.470134 -2.96368 0.701151 0.464246 1.1898 -1.88515 0.0805242 0.311573 0.732487

-19.4252 -5.65454 0.853437 0.317219 0.146167 -1.93742 0.381944 -2.01793 -0.561144 -0.896783 -0.105491 -1.06504 -0.797318

Hope someone can help :)

Upvotes: 1

Views: 877

Answers (1)

jessica

Reputation: 379

There are two approaches you can take.

One is to apply vector quantization to those vectors, converting the continuous MFCC vectors into discrete observations for the HMM.
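As a rough illustration of the vector quantization route, here is a minimal sketch in plain Python. It assumes each 13-value line from the question is one observation vector, and it uses a simple k-means codebook; the function names (`train_codebook`, `quantize`) and the codebook size are made up for this example and are not from any library. In practice you would train the codebook on many vectors and use a much larger size.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest_index(codebook, vector):
    """Index of the codebook entry closest to the vector."""
    return min(range(len(codebook)),
               key=lambda k: squared_distance(codebook[k], vector))

def train_codebook(vectors, codebook_size, iterations=20, seed=0):
    """Plain k-means: returns codebook_size centroid vectors (hypothetical helper)."""
    rng = random.Random(seed)
    # Initialise the centroids from randomly chosen training vectors.
    codebook = [list(v) for v in rng.sample(vectors, codebook_size)]
    for _ in range(iterations):
        clusters = [[] for _ in range(codebook_size)]
        for v in vectors:
            clusters[nearest_index(codebook, v)].append(v)
        for k, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                dim = len(members[0])
                codebook[k] = [sum(m[d] for m in members) / len(members)
                               for d in range(dim)]
    return codebook

def quantize(codebook, vectors):
    """Map each MFCC vector to a discrete symbol (its nearest centroid index)."""
    return [nearest_index(codebook, v) for v in vectors]

# The two MFCC vectors from the question.
mfcc_vectors = [
    [-13.8033, 0.645476, 3.2174, -0.625136, -0.470134, -2.96368, 0.701151,
     0.464246, 1.1898, -1.88515, 0.0805242, 0.311573, 0.732487],
    [-19.4252, -5.65454, 0.853437, 0.317219, 0.146167, -1.93742, 0.381944,
     -2.01793, -0.561144, -0.896783, -0.105491, -1.06504, -0.797318],
]

codebook = train_codebook(mfcc_vectors, codebook_size=2)
symbols = quantize(codebook, mfcc_vectors)  # one integer symbol per vector
```

The resulting list of integers is what a discrete HMM would take as its observation sequence.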

The other is to train the HMM with a continuous approach, modelling the MFCC vectors directly with continuous emission densities (e.g. Gaussians). You can see more in this thread:

Simple speech recognition from scratch
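To give an idea of what the continuous approach involves, here is a small sketch of the emission part only: a single diagonal-covariance Gaussian scoring an MFCC vector. This is an assumption about how a state's emission density might be modelled, not something taken from the linked thread; a full HMM would have one such density (or a mixture) per state and re-estimate them during training. The helper names and the variance floor are hypothetical.

```python
import math

def fit_diagonal_gaussian(vectors, variance_floor=1e-3):
    """Per-dimension mean and variance of a set of vectors (hypothetical helper).

    The variance floor is an illustrative guard against zero variances.
    """
    n = len(vectors)
    dim = len(vectors[0])
    mean = [sum(v[d] for v in vectors) / n for d in range(dim)]
    var = [max(sum((v[d] - mean[d]) ** 2 for v in vectors) / n, variance_floor)
           for d in range(dim)]
    return mean, var

def log_density(vector, mean, var):
    """Log of the diagonal-covariance Gaussian density at the given vector."""
    return sum(-0.5 * (math.log(2.0 * math.pi * s) + (x - m) ** 2 / s)
               for x, m, s in zip(vector, mean, var))

# The two MFCC vectors from the question, used here as toy training data.
mfcc_vectors = [
    [-13.8033, 0.645476, 3.2174, -0.625136, -0.470134, -2.96368, 0.701151,
     0.464246, 1.1898, -1.88515, 0.0805242, 0.311573, 0.732487],
    [-19.4252, -5.65454, 0.853437, 0.317219, 0.146167, -1.93742, 0.381944,
     -2.01793, -0.561144, -0.896783, -0.105491, -1.06504, -0.797318],
]

mean, var = fit_diagonal_gaussian(mfcc_vectors)
score = log_density(mfcc_vectors[0], mean, var)  # log-likelihood of one vector
```

With this approach the MFCC vectors are fed to the HMM as they are, with no quantization step.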

Upvotes: 1
