Reputation: 379
I'm performing word recognition using a traditional procedure. I extract MFCC features, then build a codebook for vector quantization. After that, I train a discrete HMM for each of two words: 1stWord and 2ndWord.
So far, I have been performing the classification like this: I run the same feature extraction and quantization on a new audio segment, estimate its likelihood under the two trained models, and assign the audio to the class with the higher likelihood. That gives me good results.
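For reference, here is a minimal sketch of that decision rule in Python. The forward algorithm is the standard way to compute the log-likelihood of a quantized observation sequence under a discrete HMM; the model parameters and the test sequence below are hypothetical placeholders standing in for my trained models and data.

```python
import numpy as np

def log_likelihood(pi, A, B, obs):
    """Log-likelihood of a discrete symbol sequence under an HMM,
    via the forward algorithm in log space.

    pi  -- initial state distribution, shape (n_states,)
    A   -- transition matrix,          shape (n_states, n_states)
    B   -- emission matrix,            shape (n_states, n_symbols)
    obs -- sequence of codebook indices
    """
    log_alpha = np.log(pi) + np.log(B[:, obs[0]])        # initialization
    for o in obs[1:]:                                    # recursion
        log_alpha = np.logaddexp.reduce(
            log_alpha[:, None] + np.log(A), axis=0) + np.log(B[:, o])
    return np.logaddexp.reduce(log_alpha)                # termination

# Hypothetical 2-state, 4-symbol models standing in for the trained HMMs.
rng = np.random.default_rng(0)
def random_model(n_states=2, n_symbols=4):
    pi = rng.dirichlet(np.ones(n_states))
    A = rng.dirichlet(np.ones(n_states), size=n_states)
    B = rng.dirichlet(np.ones(n_symbols), size=n_states)
    return pi, A, B

model_1st, model_2nd = random_model(), random_model()
obs = np.array([0, 2, 1, 3, 2])   # quantized MFCC sequence of a test segment

ll_1 = log_likelihood(*model_1st, obs)
ll_2 = log_likelihood(*model_2nd, obs)
print("1stWord" if ll_1 > ll_2 else "2ndWord")
```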
But every audio segment gets classified as one of those two words, even when it is neither. I don't know how to report that a segment corresponds to no class. I'm not sure I could solve this by training another model on all the remaining data, because that data is very heterogeneous and I think a single model wouldn't be enough.
Upvotes: 2
Views: 163
Reputation: 961
A very easy approach would be score normalization.
First, for each word model ($W_1$ and $W_2$) you need to compute the likelihood for a number of true-positive test instances.
Then, you can model these likelihoods with a Gaussian fit, computing the mean value $\mu$ and standard deviation $\sigma$ for each word model.
Finally, when it comes to checking whether an unknown word $w_j$ belongs to $W_1$ or $W_2$, you just have to normalize its score as follows:

$$\hat{LL}_j = \frac{LL_j - \mu}{\sigma}$$

for both models $W_1$ and $W_2$, where $LL_j$ is the log-likelihood of the $j$-th word test instance.
Any normalized score below $-3$ means the particular test word cannot be properly modeled by the model (either $W_1$ or $W_2$) used in the normalization process. If both normalized scores are below $-3$, the test word is modeled by neither $W_1$ nor $W_2$, hence it is another word.
You need a proper number of true-positive test words for each model in order to estimate the mean values and standard deviations reliably. How many is a proper number depends on your actual data.
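A minimal sketch of this normalization and rejection rule, assuming NumPy; the true-positive scores and the test log-likelihoods below are illustrative placeholders for your actual data.

```python
import numpy as np

def fit_normalizer(tp_log_likelihoods):
    """Gaussian fit of true-positive scores: returns (mean, std)."""
    lls = np.asarray(tp_log_likelihoods, dtype=float)
    return lls.mean(), lls.std()

def normalized_score(ll, mu, sigma):
    """z-score of a test log-likelihood under one word model."""
    return (ll - mu) / sigma

# Illustrative true-positive log-likelihoods for W1 and W2.
mu_1, sigma_1 = fit_normalizer([-310.0, -295.5, -305.2, -300.8])
mu_2, sigma_2 = fit_normalizer([-420.3, -415.7, -430.1, -425.0])

# Log-likelihoods of an unknown test word under each model.
ll_1, ll_2 = -360.0, -470.0
z_1 = normalized_score(ll_1, mu_1, sigma_1)
z_2 = normalized_score(ll_2, mu_2, sigma_2)

THRESHOLD = -3.0   # three standard deviations below the true-positive mean
if z_1 < THRESHOLD and z_2 < THRESHOLD:
    print("rejected: neither W1 nor W2")
else:
    print("W1" if z_1 >= z_2 else "W2")
```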
Upvotes: 1