Reputation: 3136
When running CMU Sphinx against a provided wav file, I get this error:
SEVERE: Missing HMM for unit P with lc=R rc=ER0
19:06:29.696 SEVERE lexTreeLinguist Bad HMM Unit: EH1
Aug 16, 2016 7:06:29 PM edu.cmu.sphinx.linguist.lextree.HMMTree addPronunciation
SEVERE: Missing HMM for unit N with lc=EH1 rc=Z
19:06:29.697 SEVERE lexTreeLinguist Bad HMM Unit: OW0
Exception in thread "main" java.lang.NullPointerException
at edu.cmu.sphinx.linguist.lextree.HMMNode.getBaseUnit(HMMTree.java:494)
at edu.cmu.sphinx.linguist.lextree.HMMNode.<init>(HMMTree.java:472)
at edu.cmu.sphinx.linguist.lextree.Node.addSuccessor(HMMTree.java:164)
at edu.cmu.sphinx.linguist.lextree.HMMTree$EntryPoint.createEntryPointMap(HMMTree.java:1154)
at edu.cmu.sphinx.linguist.lextree.HMMTree$EntryPointTable.createEntryPointMaps(HMMTree.java:1012)
at edu.cmu.sphinx.linguist.lextree.HMMTree.compile(HMMTree.java:784)
at edu.cmu.sphinx.linguist.lextree.HMMTree.<init>(HMMTree.java:706)
at edu.cmu.sphinx.linguist.lextree.LexTreeLinguist.generateHmmTree(LexTreeLinguist.java:428)
at edu.cmu.sphinx.linguist.lextree.LexTreeLinguist.compileGrammar(LexTreeLinguist.java:416)
at edu.cmu.sphinx.linguist.lextree.LexTreeLinguist.allocate(LexTreeLinguist.java:335)
at edu.cmu.sphinx.decoder.search.WordPruningBreadthFirstSearchManager.allocate(WordPruningBreadthFirstSearchManager.java:243)
at edu.cmu.sphinx.decoder.AbstractDecoder.allocate(AbstractDecoder.java:103)
at edu.cmu.sphinx.recognizer.Recognizer.allocate(Recognizer.java:164)
at edu.cmu.sphinx.api.StreamSpeechRecognizer.startRecognition(StreamSpeechRecognizer.java:52)
at edu.cmu.sphinx.api.StreamSpeechRecognizer.startRecognition(StreamSpeechRecognizer.java:39)
I am using the CMUdict from GitHub and a language model from SourceForge.
When I search for this error, people suggest that there is a mismatch between the acoustic model and the dictionary. However, I can't find any documentation on which models and dictionaries are compatible, and the CMU site does not provide any guidance. I've tried several pairings, but I would be grateful for direct guidance.
Upvotes: 0
Views: 302
Reputation: 25220
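A dictionary compatible with the model is already included in the distribution and in the source tree.
The dictionary in the GitHub project has stress marks on its phonemes (for example EH1 and OW0 in your log), and the acoustic model has no units with those names, hence the "Missing HMM" errors. The stress marks must be removed with the script before the dictionary is used in the decoder.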
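To illustrate what removing the stress marks involves, here is a minimal Python sketch. It is not the script referred to above. It assumes one entry per line in the form `word PH1 PH2 ...`, comment lines starting with `;;;`, and optional trailing ` #` comments. The names `strip_stress` and `convert_dictionary` and the file paths are hypothetical, and whether the resulting phone set fully matches your acoustic model is something you would still need to verify.

```python
import re

# A phoneme followed by a single stress digit (0, 1 or 2), e.g. "EH1" -> "EH".
_STRESSED_PHONE = re.compile(r"^([A-Za-z]+)[0-2]$")


def strip_stress(line: str) -> str:
    """Return one dictionary line with stress digits removed from its phonemes.

    Blank lines and ';;;' comment lines are returned unchanged (minus
    surrounding whitespace). A trailing ' #' comment is dropped (assumption
    about the input format).
    """
    stripped = line.strip()
    if not stripped or stripped.startswith(";;;"):
        return stripped
    entry = stripped.split(" #", 1)[0]
    word, *phones = entry.split()
    # Only the phonemes are rewritten; the word itself (e.g. "read(2)") is kept.
    phones = [_STRESSED_PHONE.sub(r"\1", p) for p in phones]
    return " ".join([word, *phones])


def convert_dictionary(src_path: str, dst_path: str) -> None:
    """Write a stress-free copy of the dictionary at src_path to dst_path."""
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            dst.write(strip_stress(line) + "\n")
```

Usage would look something like `convert_dictionary("cmudict.dict", "cmudict-nostress.dict")`, with the output file then given to the recognizer as its dictionary. If you do not need custom words, using the dictionary that ships with the model avoids this step entirely.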
Upvotes: 0