Reputation: 170
First question on SO, hopefully it isn't too far out of left field.
Motivation: I'm working on (a fork of) Benoit's google2ubuntu voice-control tool.
Currently, the user has to press a hotkey to launch the program before speaking a command. I've implemented a hotword mode in which a daemon (a bash script, really) runs continuously in the background, listens for sound above a preset threshold, records for 2 seconds, and then sends the recording to Google's speech-to-text API for conversion. It then checks the returned result for the hotword and, if the hotword is found, launches the actual program.
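To make the trigger step concrete, here is a minimal Python sketch of the "wait for sound above a threshold, then record a fixed window" logic. It is not the actual bash daemon: the names `rms` and `capture_after_trigger`, the use of RMS as the loudness measure, and the frame-based input are all assumptions made for illustration. Frames are plain lists of integer samples so the logic can be tried without a microphone.

```python
import math
from itertools import islice
from typing import Iterable, List, Optional, Sequence


def rms(frame: Sequence[int]) -> float:
    """Root-mean-square loudness of one frame of audio samples."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def capture_after_trigger(
    frames: Iterable[Sequence[int]],
    threshold: float,
    frames_to_record: int,
) -> Optional[List[Sequence[int]]]:
    """Skip quiet frames; once one frame exceeds the threshold, return it
    together with the frames that follow, up to frames_to_record in total.
    Returns None if no frame is loud enough."""
    stream = iter(frames)
    for frame in stream:
        if rms(frame) > threshold:
            # The loud frame starts the recording; take the rest of the window.
            return [frame] + list(islice(stream, frames_to_record - 1))
    return None
```

With 16 kHz audio split into 0.1 s frames (both assumed values), `frames_to_record=20` would correspond to the 2-second window described above. The returned frames are what would then be sent off for recognition.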
I'm looking for hotwords that are reliably recognized by the Google API. The API returns a text representation of what it thinks you said, along with a confidence level indicating how well its guess and your recording match up.
Using this we can compare the rates of detection for different hotwords: for instance the phrase "okay Google" is (not surprisingly) very well-recognized, regularly returning results like
"hypotheses": {"utterance": "Okay Google", "confidence": 0.95967352}
The more generic "okay computer" is not recognized as reliably, but still does alright at an average confidence level of 0.85. Some more obscure phrases I've tested include "okay Jarvis" (if we're going to make a voice-controlled computer...), which is unfortunately hit-and-miss: high confidence levels half the time and complete misses otherwise. "Okay Linux", on the other hand, is not recognized at all.
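For anyone who wants to run the same comparison, the bookkeeping can be sketched roughly as follows. This assumes each response is a JSON object with a "hypotheses" entry carrying "utterance" and "confidence" fields, as in the snippet above (a list of such entries is also accepted); the function name `summarize`, the `min_confidence` cutoff, and the exact-match rule are my own choices, not part of the API.

```python
import json
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def summarize(
    trials: Iterable[Tuple[str, str]],
    min_confidence: float = 0.5,
) -> Dict[str, Dict[str, float]]:
    """trials holds (hotword_spoken, raw_json_response) pairs.
    Returns the detection rate and the mean confidence of the hits
    for each hotword."""
    results = defaultdict(list)
    for hotword, raw in trials:
        hyp = json.loads(raw).get("hypotheses") or {}
        if isinstance(hyp, list):  # tolerate a list of hypotheses
            hyp = hyp[0] if hyp else {}
        heard = hyp.get("utterance", "").lower()
        conf = float(hyp.get("confidence", 0.0))
        # A hit means the transcript matches the hotword with enough confidence.
        hit = heard == hotword.lower() and conf >= min_confidence
        results[hotword].append(conf if hit else None)

    summary = {}
    for hotword, outcomes in results.items():
        hits = [c for c in outcomes if c is not None]
        summary[hotword] = {
            "detection_rate": len(hits) / len(outcomes),
            "mean_confidence": sum(hits) / len(hits) if hits else 0.0,
        }
    return summary
```

Feeding it a few dozen recordings per phrase gives one detection rate and one average confidence per candidate hotword, which is the kind of figure quoted above.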
Question: Does anyone know what sort of phrases are reliably recognized by the Google API?
Good hotwords are short phrases that would not commonly appear in daily speech (otherwise we'd set off the program every time we had a conversation), yet are "special" enough to be recognizable even by dumb computers.
Upvotes: 1
Views: 802
Reputation: 25220
It's better to listen with an offline keyword detector like the one recently implemented in CMUSphinx. That way there is no need to stream all audio to Google or to keep an internet connection open, and the response is fast. The keyphrase is configurable and the detection threshold can be tuned. Your competitors have already integrated this into their assistants, for example in the Pocketsphinx Android Demo. It's possible to use keyword spotting from the Python API too.
Upvotes: 2