ChatCloud

Reputation: 1200

Command detection with Deep Neural Networks using Kaldi without binding to a language

Has anybody seen a sample showing how to set up a simple application that trains a DNN and then uses it to recognize a limited number of voice commands, without being tied to a particular language? I believe the Kaldi API is powerful enough for this, but the documentation is lacking.

Upvotes: 0

Views: 510

Answers (1)

Nikolay Shmyrev

Reputation: 25220

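1) Take an existing DNN model or train one yourself. You can use the Tedlium experiment from Kaldi, which is free to run. It does not matter if the model was trained for English; it will work for other languages too, because its posteriors are only compared against each other, not decoded into words.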

2) Extract DNN posteriors for both the training keyphrases and the audio you want to recognize. The nnet3-am-compute tool can be used for that: it takes a DNN model and returns phonetic or state posteriors for every frame.

3) Implement the DTW (dynamic time warping) algorithm to compare the sequences of DNN posteriors. You have to write this part yourself; it is not implemented in Kaldi.
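To make step 3 more concrete, here is a minimal sketch of DTW over posteriors in plain Python. It assumes the posteriors from step 2 have already been loaded as lists of per-frame probability vectors. The function names (frame_distance, dtw_distance, recognize), the negative-log inner-product frame distance and the length normalisation are my own illustrative choices, not something Kaldi provides, and the threshold has to be tuned on your own recordings.

```python
import math


def frame_distance(p, q, eps=1e-10):
    """Negative log of the inner product of two posterior vectors.

    Assumption: this is one common distance for posteriorgrams;
    other measures (e.g. cosine or KL-based) can be swapped in.
    """
    dot = sum(a * b for a, b in zip(p, q))
    return -math.log(max(dot, eps))  # eps guards against log(0)


def dtw_distance(template, query):
    """Length-normalised DTW cost between two posterior sequences.

    Each argument is a list of frames; each frame is a list of
    posterior probabilities with the same dimension.
    """
    n, m = len(template), len(query)
    if n == 0 or m == 0:
        raise ValueError("both sequences must contain at least one frame")
    inf = float("inf")
    # cost[i][j]: best accumulated cost aligning template[:i] with query[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(template[i - 1], query[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # template frame advances
                                 cost[i][j - 1],      # query frame advances
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m] / (n + m)


def recognize(templates, query, threshold):
    """Return the label of the closest template, or None if nothing is close enough.

    templates: dict mapping a command label to its posterior sequence.
    threshold: hypothetical rejection threshold, to be tuned on real data.
    """
    best_label, best_cost = None, float("inf")
    for label, template in templates.items():
        c = dtw_distance(template, query)
        if c < best_cost:
            best_label, best_cost = label, c
    return best_label if best_cost <= threshold else None


if __name__ == "__main__":
    # Toy 2-class posteriors standing in for real DNN output.
    templates = {
        "go": [[0.9, 0.1], [0.9, 0.1], [0.1, 0.9]],
        "stop": [[0.1, 0.9], [0.1, 0.9], [0.9, 0.1]],
    }
    query = [[0.8, 0.2], [0.2, 0.8], [0.2, 0.8]]
    print(recognize(templates, query, threshold=0.3))
```

This version compares whole, already segmented utterances. To spot a command inside continuous audio you would need a subsequence variant of DTW that lets the match start and end anywhere in the query. With several recordings per command, you can keep them all as templates and take the best score.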

Related papers describing the algorithm:

Investigating Neural Network based Query-by-Example Keyword Spotting Approach for Personalized Wake-up Word Detection in Mandarin Chinese

Query-By-Example Spoken Term Detection Using Phonetic Posteriorgram Templates

Upvotes: 0
