Reputation: 818
I have created my own CMUSphinx language model for Arabic, for software that will listen to a user and apply commands, along with a dictionary that I wrote by hand. I converted the ARPA language model to the DMP format using the command sphinx_lm_convert -i ar.lm -o ar.lm.dmp
So here are the files that I have so far:
I then recorded myself saying each word. Each word has its own .wav file, and they are all in one folder, separate from the folder that holds the .dic, .txt, and .lm files.
My question is: what is the next step? I was reading the tutorial here: http://cmusphinx.sourceforge.net/wiki/tutorial
It says that adapting an existing acoustic model is the next step after building the language model. Isn't that training the language model?
And if it is training, I have all the required files except the:
What should be inside these two files?
Thanks
Upvotes: 2
Views: 459
Reputation: 25220
The procedure for training an acoustic model is described in the tutorial for Acoustic Model Training.
You need to create the fileids and transcription files manually in a text editor, or with a script if you want to convert existing transcriptions from a custom format to the required one.
The fileids file must list the file names, and the transcription file must list the transcription of each of those files in a special format.
For an example of an acoustic model training database, you can look inside the an4 database.
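If you would rather generate the two files with a script, here is a minimal Python sketch. It assumes the layout used in the an4 database: each fileids line is a recording path without the .wav extension, and each transcription line has the form <s> words </s> (utterance_id), where the id matches the file name. The function names, the wav_dir argument, and the transcripts mapping (file stem to spoken words) are hypothetical and only for illustration, so check the output against an4 before training.

```python
from pathlib import Path


def build_training_lists(wav_dir, transcripts):
    """Build fileids and transcription lines for the .wav files in wav_dir.

    transcripts is a hypothetical mapping: file name without extension
    -> the word(s) spoken in that recording.
    """
    fileids, transcription = [], []
    for wav in sorted(Path(wav_dir).glob("*.wav")):
        stem = wav.stem  # file name without the .wav extension
        if stem not in transcripts:
            continue  # skip recordings that have no known transcript
        fileids.append(stem)
        # Assumed an4-style line: <s> words </s> (utterance_id)
        transcription.append(f"<s> {transcripts[stem]} </s> ({stem})")
    return fileids, transcription


def write_lines(path, lines):
    """Write one entry per line, UTF-8 encoded so Arabic text is preserved."""
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")
```

You would call build_training_lists on your recordings folder with your own mapping, then pass each returned list to write_lines with whatever file names your training setup expects. Both lists come out in the same order, so line N of one file always corresponds to line N of the other.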
Upvotes: 1