Reputation: 304
I copied some code from a website to listen for specific words in Python using pocketsphinx. It runs, but it never outputs the keyword as expected. This is my code:
import sys, os
from pocketsphinx.pocketsphinx import *
from sphinxbase.sphinxbase import *
import pyaudio
# modeldir = "../../../model"
# datadir = "../../../test/data"
modeldir="C://Users//hp//AppData//Local//Programs//Python//Python35//Lib//site-packages//pocketsphinx//model//en-us"
dictdir="C://Users//hp//AppData//Local//Programs//Python//Python35//Lib//site-packages//pocketsphinx//model//cmudict-en-us.dict"
lmdir="C://Users//hp//AppData//Local//Programs//Python//Python35//Lib//site-packages//pocketsphinx//model//en-us.lm.bin"
# Create a decoder with certain model
config = Decoder.default_config()
config.set_string('-hmm', modeldir)
config.set_string('-lm', lmdir )
config.set_string('-dict', dictdir)
config.set_string('-keyphrase', 'forward')
config.set_float('-kws_threshold', 1e+20)
p = pyaudio.PyAudio()
stream = p.open(format=pyaudio.paInt16, channels=1, rate=16000, input=True, frames_per_buffer=1024)
stream.start_stream()
# Process audio chunk by chunk. On keyword detected perform action and restart search
decoder = Decoder(config)
decoder.start_utt()
while True:
    buf = stream.read(1024)
    if buf:
        decoder.process_raw(buf, False, False)
    else:
        break
    if decoder.hyp() != None:
        #print(decoder.hyp().hypstr)
        if decoder.hyp().hypstr == 'forward':
            print ([(seg.word, seg.prob, seg.start_frame, seg.end_frame) for seg in decoder.seg()])
            print ("Detected keyword, restarting search")
            decoder.end_utt()
            decoder.start_utt()
Also, when I use print(decoder.hyp().hypstr), it just outputs random words whenever I say anything. For example, if I speak a word or a sentence, it outputs:
the
the
the
the
the
the
the
the
the
the
the
the
the
the
the
the
the
the
the
the
the
the
the
the
the
the
the
the
the
the
the
the
the da
the head
the bed
the bedding
the heading of
the bedding and
the bedding and
the bedding and
the bedding and
the bedding and
the bedding and
the bedding and
the bedding and
the bedding and
the bedding and
the bedding and
the bedding and
the bedding and well
the bedding and well
the bedding and well
the bedding and butler
the bedding and what lingus
the bedding and what lingus
the bedding and what lingus
the bedding and what lingus ha
the bedding and blessed are
the bedding and blessed are
the bedding and what lingus on
the bedding and what lingus want
the bedding and what lingus want
the bedding and what lingus want
the bedding and what lingus want
the bedding and what lingus want or
the bedding and what lingus want to talk
the bedding and what lingus current top
the bedding and what lingus want to talk
the bedding and what lingus want to talk
the bedding and what lingus want to talk
the bedding and what lingus want to talk
the bedding and what lingus want to talk to her
the bedding and what lingus want to talk to her
the bedding and what lingus want to talk to her
the bedding and what lingus want to talk to her
Please help me with this. I am new to Python.
Upvotes: 1
Views: 1069
Reputation: 25220
You need to remove this line:
config.set_string('-lm', lmdir )
Keyphrase search and lm search are mutually exclusive.
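A minimal sketch of a keyphrase-only setup, assuming the same set_string/set_float config calls used in the question. The helper keyphrase_options, its parameter names, and the 'model' directory are hypothetical and only there for illustration; the options are collected in a plain list so it is easy to confirm that no '-lm' entry is passed.

```python
import os


def keyphrase_options(model_dir, keyphrase, threshold):
    """Return (flag, value) pairs for keyphrase spotting only.

    Hypothetical helper for illustration. There is deliberately no '-lm'
    entry, because keyphrase search and lm search are mutually exclusive.
    """
    return [
        ('-hmm', os.path.join(model_dir, 'en-us')),
        ('-dict', os.path.join(model_dir, 'cmudict-en-us.dict')),
        ('-keyphrase', keyphrase),
        ('-kws_threshold', threshold),
    ]


# 'model' is a placeholder; substitute the real model directory.
options = keyphrase_options('model', 'forward', 1e+20)

# Applying the options, assuming `config` is the object returned by
# Decoder.default_config() as in the question:
# for flag, value in options:
#     if isinstance(value, float):
#         config.set_float(flag, value)
#     else:
#         config.set_string(flag, value)
```

With the language model left out, the decoder should report only the keyphrase instead of free-form text.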
Upvotes: 1
Reputation: 51
First, to clarify: your pocketsphinx setup is working.
Based on my experience, pocketsphinx is hardly the most accurate voice recognition tool you can use, but it is probably your best bet for an offline solution. Pocketsphinx can only transcribe your speech as well as its model allows. These models still seem to be a work in progress, and much could be improved. There are a few things you can do to try to increase recognition accuracy, such as reducing noise and tuning the recognizer, but that's outside the immediate scope of this question.
From what I understand of your code, you're waiting for the user to say a specific keyword and want it recognized by pocketsphinx's backend. This keyword seems to be "forward". You can read up further on how to properly accomplish "hot word listening".
You have the right idea, but the approach can be improved. Here is my "quick fix" version of your code:
import os
import pyaudio
import pocketsphinx as ps
modeldir = "C://Users//hp//AppData//Local//Programs//Python//Python35//Lib//site-packages//pocketsphinx//model//"
# Create a decoder with certain model
config = ps.Decoder.default_config()
config.set_string('-hmm', os.path.join(modeldir, 'en-us'))
config.set_string('-lm', os.path.join(modeldir, 'en-us.lm.bin'))
config.set_string('-dict', os.path.join(modeldir, 'cmudict-en-us.dict'))
config.set_string('-keyphrase', 'forward')
config.set_float('-kws_threshold', 1e+20)
p = pyaudio.PyAudio()
stream = p.open(format=pyaudio.paInt16, channels=1, rate=16000, input=True, frames_per_buffer=1024)
stream.start_stream()
# Process audio chunk by chunk. On keyword detected perform action and restart search
decoder = ps.Decoder(config)
decoder.start_utt()
while True:
    buf = stream.read(1024)
    if buf:
        decoder.process_raw(buf, False, False)
    else:
        break
    if decoder.hyp() is not None:
        print(decoder.hyp().hypstr)
        if 'forward' in decoder.hyp().hypstr:
            print([(seg.word, seg.prob, seg.start_frame, seg.end_frame) for seg in decoder.seg()])
            print("Detected keyword, restarting search")
            decoder.end_utt()
            decoder.start_utt()
For any one pocketsphinx.Decoder() "session" (i.e. after invoking the .start_utt() method without subsequently calling .end_utt()), the decoder.hyp().hypstr value effectively keeps growing: words are appended to it each time pocketsphinx's decoding finds a "valid" recognition in the input audio stream.
You've used if decoder.hyp().hypstr == 'forward':. This requires the whole string to be exactly "forward" for the code to enter that (I presume, desired) conditional block. Since pocketsphinx, by default, is not very accurate, it generally takes a few tries on most words before it registers the correct one. For this reason, and since decoder.hyp().hypstr keeps growing (as previously explained), I've used the line if 'forward' in decoder.hyp().hypstr: instead. This looks for the desired keyword "forward" anywhere in the string, so incorrect recognitions are tolerated until the keyword is found.
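To illustrate the difference between the two comparisons, here is a small sketch that needs no audio or pocketsphinx install. The hypothesis strings are made up, and the helpers exact_match and contains_word are hypothetical names used only for this example. It also shows a whole-word check, since a plain substring test would accept words such as "forwards" as well.

```python
def exact_match(hypstr, keyword):
    """True only if the whole hypothesis is exactly the keyword."""
    return hypstr == keyword


def contains_word(hypstr, keyword):
    """True if the keyword appears as a separate word in the hypothesis."""
    return keyword in hypstr.split()


# Made-up partial hypotheses that grow within a single utterance.
partials = ['the', 'the bedding and', 'the bedding and forward']

for hypstr in partials:
    print(repr(hypstr),
          exact_match(hypstr, 'forward'),
          'forward' in hypstr,
          contains_word(hypstr, 'forward'))
```

Once other words have accumulated in the hypothesis, the exact comparison can no longer succeed, while the substring and whole-word checks still find the keyword.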
I hope it helps!
Upvotes: 1