Brian Dolan

Reputation: 3136

CMU Sphinx Compatible Dictionaries

When running CMU Sphinx against a provided wav file, I get this error:

SEVERE: Missing HMM for unit P with lc=R rc=ER0
19:06:29.696 SEVERE lexTreeLinguist    Bad HMM Unit: EH1
Aug 16, 2016 7:06:29 PM edu.cmu.sphinx.linguist.lextree.HMMTree addPronunciation
SEVERE: Missing HMM for unit N with lc=EH1 rc=Z
19:06:29.697 SEVERE lexTreeLinguist    Bad HMM Unit: OW0
Exception in thread "main" java.lang.NullPointerException
at edu.cmu.sphinx.linguist.lextree.HMMNode.getBaseUnit(HMMTree.java:494)
at edu.cmu.sphinx.linguist.lextree.HMMNode.<init>(HMMTree.java:472)
at edu.cmu.sphinx.linguist.lextree.Node.addSuccessor(HMMTree.java:164)
at edu.cmu.sphinx.linguist.lextree.HMMTree$EntryPoint.createEntryPointMap(HMMTree.java:1154)
at edu.cmu.sphinx.linguist.lextree.HMMTree$EntryPointTable.createEntryPointMaps(HMMTree.java:1012)
at edu.cmu.sphinx.linguist.lextree.HMMTree.compile(HMMTree.java:784)
at edu.cmu.sphinx.linguist.lextree.HMMTree.<init>(HMMTree.java:706)
at edu.cmu.sphinx.linguist.lextree.LexTreeLinguist.generateHmmTree(LexTreeLinguist.java:428)
at edu.cmu.sphinx.linguist.lextree.LexTreeLinguist.compileGrammar(LexTreeLinguist.java:416)
at edu.cmu.sphinx.linguist.lextree.LexTreeLinguist.allocate(LexTreeLinguist.java:335)
at edu.cmu.sphinx.decoder.search.WordPruningBreadthFirstSearchManager.allocate(WordPruningBreadthFirstSearchManager.java:243)
at edu.cmu.sphinx.decoder.AbstractDecoder.allocate(AbstractDecoder.java:103)
at edu.cmu.sphinx.recognizer.Recognizer.allocate(Recognizer.java:164)
at edu.cmu.sphinx.api.StreamSpeechRecognizer.startRecognition(StreamSpeechRecognizer.java:52)
at edu.cmu.sphinx.api.StreamSpeechRecognizer.startRecognition(StreamSpeechRecognizer.java:39)

I am using the CMUDict from GitHub and a language model from SourceForge.

When googling this error, people hint that there is a mismatch between the acoustic model and the dictionary. However, I can't find any documentation on which models and dictionaries are compatible. The CMU site does not provide any guidance. I've attempted several pairings, but I would be grateful for direct guidance.

Upvotes: 0

Views: 302

Answers (1)

Nikolay Shmyrev

Reputation: 25220

A dictionary compatible with the model is already included in the distribution and in the source tree.

The dictionary in the GitHub project has stress marks in its phonemes (for example EH1 or OW0, the units reported as bad in your log). These must be removed with the script before the dictionary is used in the decoder.
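
For illustration, the stress-stripping step could look roughly like the sketch below. This is not the project's own script. The function names `strip_stress` and `convert` are made up for this example, and it assumes each entry is a word followed by space-separated phonemes, with stress written as a trailing digit 0, 1 or 2, and with comments starting at `#` or `;;;`.

```python
import re

# A phoneme followed by a single stress digit, e.g. "EH1" or "OW0".
_STRESSED = re.compile(r"^([A-Za-z]+)[0-2]$")


def strip_stress(line):
    """Return one dictionary entry with stress digits removed from its phonemes.

    Returns None for blank lines and comment lines. The word field
    (including variant markers such as "read(2)") is left untouched.
    """
    # Assumption: trailing comments are introduced by " #".
    entry = line.split(" #", 1)[0].strip()
    if not entry or entry.startswith("#") or entry.startswith(";;;"):
        return None
    word, *phones = entry.split()
    phones = [_STRESSED.sub(r"\1", p) for p in phones]
    return " ".join([word, *phones])


def convert(src_path, dst_path):
    """Write a stress-free copy of the dictionary at src_path to dst_path."""
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            out = strip_stress(line)
            if out is not None:
                dst.write(out + "\n")
```

With this, an entry such as `pennies P EH1 N IY0 Z` becomes `pennies P EH N IY Z`. Whether the resulting phone set matches your acoustic model still has to be checked against that model.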

Upvotes: 0
