Reputation: 39
I want to add speech recognition to an Asterisk server. I want to try an offline solution based on CMU Sphinx, but it works very slowly. Recognition with a simple dictionary (yes|no|normal) takes about 20 seconds. I use this command:
pocketsphinx_continuous \
-samprate 8000 \
-dict my.dic \
-lm ru.lm \
-hmm zero_ru.cd_cont_4000 \
-maxhmmpf 3000 \
-maxwpf 5 \
-topn 2 \
-ds 2 \
-logfn log.log \
-remove_noise no \
-infile 1.wav
Is it possible to reduce the time to 1-2 seconds, or must I look at an online solution (Google, Yandex, etc.)?
Upvotes: 0
Views: 1094
Reputation: 119
ASR and STT are 2 different things.
In the case of PocketSphinx, you can use server mode and connect over MRCP (check the UniMRCP project). It is more efficient not to load the data and engine for each recognition, but to start the server once and connect with one or more MRCP clients.
Upvotes: 0
Reputation: 25220
You have a number of mistakes in your attempt:
The proper command would be:
pocketsphinx_continuous \
-samprate 8000 \
-dict ru.dic \
-jsgf my.jsgf \
-hmm zero_ru.cd_ptm_4000 \
-infile 1.wav
JSGF should look like this:
#JSGF V1.0;
grammar result;
public <result> = да | нет | нормально;
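As a minimal sketch, the grammar above can be written to `my.jsgf` from the shell, and its words can be checked against the dictionary, since a grammar word without a dictionary entry cannot be recognized. The file names `my.jsgf` and `ru.dic` are taken from the command above; the `sed` pattern assumes the single-rule layout shown here and is not a general JSGF parser.

```shell
#!/bin/sh
# Write the three-word grammar to my.jsgf (file name taken from the command above).
cat > my.jsgf <<'EOF'
#JSGF V1.0;
grammar result;
public <result> = да | нет | нормально;
EOF

# Pull the alternatives out of the public rule, one word per line.
# Assumption: the rule is a flat list of single words separated by "|".
words=$(sed -n 's/^public <result> = \(.*\);$/\1/p' my.jsgf | tr -d ' ' | tr '|' '\n')

# If the dictionary is present, report grammar words that have no entry in it.
if [ -f ru.dic ]; then
    for w in $words; do
        grep -q "^$w[[:space:]]" ru.dic || echo "missing from ru.dic: $w"
    done
fi
```

If the script prints nothing while `ru.dic` is present, every grammar word has a dictionary entry.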
The total time to run the command is:
real 0m0.822s
user 0m0.789s
sys 0m0.028s
The actual recognition takes 0.02 seconds:
INFO: fsg_search.c(265): TOTAL fsg 0.02 CPU 0.006 xRT
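To read that log line: assuming `xRT` is the real-time factor (CPU seconds divided by seconds of audio), the two numbers give a rough estimate of how long the recording was. The sketch below only does that arithmetic on the quoted line; both values are rounded in the log, so the result is approximate.

```shell
#!/bin/sh
# Sample line copied from the log output quoted above.
line='INFO: fsg_search.c(265): TOTAL fsg 0.02 CPU 0.006 xRT'

# Whitespace-separated fields: $5 = CPU seconds, $7 = real-time factor.
# Assumption: xRT = CPU seconds / audio seconds, so audio = CPU / xRT.
audio_secs=$(printf '%s\n' "$line" | awk '{ printf "%.1f", $5 / $7 }')
echo "approximate audio length: ${audio_secs} s"
```

An xRT well below 1 means decoding runs much faster than real time, which is what makes a 1-2 second response feasible for a small grammar.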
Upvotes: 2
Reputation: 15259
For reference, the Google Cloud solution takes 2.5-3.5 seconds for a 0-5 second recording.
The only faster option I know of is the streaming (real-time) gRPC version of Google Cloud, which takes about 1 second after the end of the word.
Speech recognition is a very CPU-intensive task. You can decrease recognition time by using a faster CPU or by using a speech context with only a few words. But it is really unlikely that you will get 10x faster recognition.
Upvotes: 1