Reputation: 4129
I'm making a prototype for an accessibility app for Windows with Dragonfly for Python. Frustratingly, Windows Speech Recognition (WSR) recognizes audio from the computer, which is a huge problem for me as it recognizes speech generated by its own engine. For example, using speak:
e = dragonfly.get_engine()
e.speak("Welcome. What can I do for you today? Please say a command.")
WSR, in its infinite wisdom, hears "Please say" from the computer speakers and interprets that as "Yes". I've changed around the wording of the prompts, but this is a consistent problem with many parts of the prototype. I would also prefer not to change my prompts to "Affirmative" and forget "Yes", because that seems like the opposite of accessible.
This is what my simple response class looks like:
from dragonfly import CompoundRule, Choice

class SingleWordResponse(CompoundRule):
    spec = "<word>"
    extras = [Choice("word", {"no": 0, "yes": 1, "ready": 1, "okay": 1, "repeat": 2})]
    mode = 0

    def _process_recognition(self, node, extras):
        # here I use self.mode to keep track of what the user is doing
        # and respond based on the context
        pass
I'm open to various methods of disabling or circumventing this unwanted "feature". I've tried using different contexts, but the context documentation is not very clear on its use. I've also tried setting a speaking attribute to prevent this, but it doesn't seem to work. This is a test for the speaking attribute:
class SingleWordResponse(CompoundRule):
    spec = "<word>"
    extras = [Choice("word", {"no": 0, "yes": 1, "ready": 1, "okay": 1, "repeat": 2})]
    speaking = False

    def _process_recognition(self, node, extras):
        if not self.speaking:
            print("command recognized")
            # process command
        # otherwise do nothing
I set SingleWordResponse.speaking to True immediately before the e.speak() call and then set it to False immediately after, but to no avail.
Upvotes: 2
Views: 350
Reputation: 314
One possible solution would be getting creative with Rule.disable() and Rule.enable(). That would at least stop your rule from being recognized when you don't want it to be.
However, it doesn't deal with WSR recognizing its own speech from the speak() function, so you're still going to get random text insertion whenever you call speak(). I thought you might be having a microphone quality problem, so I tested your code both with my speaker volume up high and with the speakers off. It seems you are correct in calling this a WSR problem, though: I still had the problem even with the speakers off.
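As a rough illustration of the Rule.disable() / Rule.enable() idea, here is a minimal sketch. The helper names rule_muted and speak_muted are hypothetical (they are not part of Dragonfly); the only things assumed about the objects passed in are a rule with disable() and enable() methods and an engine with a speak(text) method. It also assumes speak() does not return until the prompt has finished playing; if it returns early on your setup, the rule would be re-enabled too soon.

```python
from contextlib import contextmanager


@contextmanager
def rule_muted(rule):
    """Hypothetical helper: keep `rule` disabled while the block runs."""
    rule.disable()
    try:
        yield rule
    finally:
        # Re-enable even if speaking raised, so the rule is not left off.
        rule.enable()


def speak_muted(engine, rule, text):
    """Hypothetical helper: speak `text` while `rule` cannot be recognized.

    Assumes engine.speak() blocks until the speech has finished.
    """
    with rule_muted(rule):
        engine.speak(text)
```

With Dragonfly this might be called as speak_muted(dragonfly.get_engine(), response_rule, "Please say a command."), where response_rule is the SingleWordResponse instance that was added to your grammar. This only keeps the rule from firing; it does not stop WSR from hearing the prompt.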
Incidentally, and I hate to suggest the "throw more money at it" solution for various reasons, but Dragon NaturallySpeaking with Natlink+Dragonfly doesn't have this feedback problem (DNS versions 11/12 on Windows XP/7 at least, didn't test it elsewhere). WSR just isn't smart enough.
Upvotes: 1