Reputation: 3507
I have created all the files required to run Sphinx4 (language model, dictionary, and acoustic model). But when I run it in Eclipse, the following exception is thrown:
00:16:12.707 INFO unitManager CI Unit: AE
00:16:12.713 INFO unitManager CI Unit: AH
00:16:12.714 INFO unitManager CI Unit: B
00:16:12.714 INFO unitManager CI Unit: EY
00:16:12.715 INFO unitManager CI Unit: F
00:16:12.715 INFO unitManager CI Unit: IY
00:16:12.716 INFO unitManager CI Unit: JH
00:16:12.716 INFO unitManager CI Unit: L
00:16:12.717 INFO unitManager CI Unit: M
00:16:12.722 INFO autoCepstrum Cepstrum component auto-configured as follows: autoCepstrum {MelFrequencyFilterBank, DiscreteCosineTransform}
00:16:12.853 INFO dictionary Loading dictionary from: file:Alphabets/tutorial/alphabets/etc/alphabets.dic
00:16:12.853 INFO dictionary Loading filler dictionary from: file:Alphabets/tutorial/alphabets/model_parameters/alphabets.ci_cont/noisedict
00:16:12.854 INFO acousticModelLoader Loading tied-state acoustic model from: file:Alphabets/tutorial/alphabets/model_parameters/alphabets.ci_cont
00:16:12.854 INFO acousticModelLoader Pool means Entries: 30
00:16:12.855 INFO acousticModelLoader Pool variances Entries: 30
00:16:12.855 INFO acousticModelLoader Pool transition_matrices Entries: 10
00:16:12.855 INFO acousticModelLoader Pool senones Entries: 30
00:16:12.855 INFO acousticModelLoader Pool mixture_weights Entries: 30
00:16:12.856 INFO acousticModelLoader Pool senones Entries: 30
00:16:12.856 INFO acousticModelLoader Context Independent Unit Entries: 10
00:16:12.856 INFO acousticModelLoader HMM Manager: 10 hmms
00:16:12.860 INFO acousticModel CompositeSenoneSequences: 0
00:16:12.861 INFO largeTrigramModel Loading n-gram language model from: file:Alphabets/tutorial/alphabets/etc/alphabets.lm.dmp
00:16:12.867 INFO largeTrigramModel 1-grams: 3
00:16:12.867 INFO largeTrigramModel 2-grams: 1
00:16:12.867 INFO largeTrigramModel 3-grams: 1
00:16:13.094 INFO lexTreeLinguist Max CI Units 11
00:16:13.095 INFO lexTreeLinguist Unit table size 1331
Exception in thread "main" java.lang.IllegalArgumentException
at com.google.common.base.Preconditions.checkArgument(Preconditions.java:111)
at edu.cmu.sphinx.linguist.WordSequence.getWord(WordSequence.java:179)
at edu.cmu.sphinx.linguist.language.ngram.large.LargeNGramModel.getNGramProbDepth(LargeNGramModel.java:409)
at edu.cmu.sphinx.linguist.language.ngram.large.LargeNGramModel.getNGramProbDepth(LargeNGramModel.java:412)
at edu.cmu.sphinx.linguist.language.ngram.large.LargeNGramModel.getNGramProbDepth(LargeNGramModel.java:412)
at edu.cmu.sphinx.linguist.language.ngram.large.LargeNGramModel.getProbDepth(LargeNGramModel.java:393)
at edu.cmu.sphinx.linguist.lextree.LexTreeLinguist$LexTreeState.createWordStateArc(LexTreeLinguist.java:720)
at edu.cmu.sphinx.linguist.lextree.LexTreeLinguist$LexTreeWordState.getSuccessors(LexTreeLinguist.java:1491)
at edu.cmu.sphinx.decoder.search.WordPruningBreadthFirstSearchManager.collectSuccessorTokens(WordPruningBreadthFirstSearchManager.java:635)
at edu.cmu.sphinx.decoder.search.WordPruningBreadthFirstSearchManager.growBranches(WordPruningBreadthFirstSearchManager.java:387)
at edu.cmu.sphinx.decoder.search.WordPruningBreadthFirstSearchManager.localStart(WordPruningBreadthFirstSearchManager.java:359)
at edu.cmu.sphinx.decoder.search.WordPruningBreadthFirstSearchManager.startRecognition(WordPruningBreadthFirstSearchManager.java:262)
at edu.cmu.sphinx.decoder.Decoder.decode(Decoder.java:62)
at edu.cmu.sphinx.recognizer.Recognizer.recognize(Recognizer.java:109)
at edu.cmu.sphinx.recognizer.Recognizer.recognize(Recognizer.java:125)
at edu.cmu.sphinx.api.AbstractSpeechRecognizer.getResult(AbstractSpeechRecognizer.java:50)
at Main.main(Main.java:30)
This is the program I am running, taken from the official website:
import java.io.IOException;
import java.util.Scanner;
import edu.cmu.sphinx.api.Configuration;
import edu.cmu.sphinx.api.LiveSpeechRecognizer;
import edu.cmu.sphinx.api.SpeechResult;

public class Main {
    public static void main(String[] args) {
        Configuration configuration = new Configuration();
        configuration
                .setAcousticModelPath("Alphabets/tutorial/alphabets/model_parameters/alphabets.ci_cont");
        configuration.setDictionaryPath("Alphabets/tutorial/alphabets/etc/alphabets.dic");
        configuration
                .setLanguageModelPath("Alphabets/tutorial/alphabets/etc/alphabets.lm.dmp");
        LiveSpeechRecognizer recognizer = null;
        try {
            recognizer = new LiveSpeechRecognizer(configuration);
        } catch (IOException e) {
            e.printStackTrace();
        }
        recognizer.startRecognition(true);
        SpeechResult result = recognizer.getResult();
        recognizer.stopRecognition();
        System.out.println(result.getHypothesis());
        result.getLattice().dumpDot("lattice.dot", "lattice");
    }
}
Any help is highly appreciated!
Upvotes: 1
Views: 224
Reputation: 25220
Your language model Alphabets/tutorial/alphabets/etc/alphabets.lm.dmp is in text ARPA format, but you added a .dmp extension to it. Because of that extension, the recognizer loads the file as a binary DMP model (note the largeTrigramModel lines in your log) and fails. To fix the issue, rename alphabets.lm.dmp to alphabets.lm, without the dmp extension, and update the path in the code. Just use
configuration.setLanguageModelPath("Alphabets/tutorial/alphabets/etc/alphabets.lm");
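If you are unsure which format a model file is really in, a quick check before renaming can help. The sketch below is a rough heuristic, not part of Sphinx4: the class name LanguageModelFormatCheck and the method looksLikeArpa are made up for illustration. It relies on the fact that a text ARPA model contains a line reading \data\ near the top, and it assumes that marker appears within the first 4096 bytes.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Sphinx4.
final class LanguageModelFormatCheck {

    // Returns true if the first 4096 bytes of the file contain the "\data\"
    // marker that opens the n-gram counts section of a text ARPA model.
    // Assumption: any header comments before the marker are shorter than that.
    static boolean looksLikeArpa(Path file) throws IOException {
        byte[] head = new byte[4096];
        int length = 0;
        try (InputStream in = Files.newInputStream(file)) {
            int read;
            // Keep reading until the buffer is full or the stream ends.
            while (length < head.length
                    && (read = in.read(head, length, head.length - length)) > 0) {
                length += read;
            }
        }
        // ISO-8859-1 maps every byte to a character, so binary input decodes safely.
        String text = new String(head, 0, length, StandardCharsets.ISO_8859_1);
        return text.contains("\\data\\");
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java LanguageModelFormatCheck <language-model-file>");
            return;
        }
        Path model = Paths.get(args[0]);
        System.out.println(model + (looksLikeArpa(model)
                ? " looks like a text ARPA model"
                : " does not look like a text ARPA model"));
    }
}
```

Run it with the path to alphabets.lm.dmp as the only argument. If it reports a text ARPA model, the file should carry the .lm extension as described above.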
You also do not have enough data to train the model, so your model is not going to work. A significant amount of training data is mandatory. You can find details in the acoustic model training tutorial:
http://cmusphinx.sourceforge.net/wiki/tutorialam
Upvotes: 1