Microsoft Voice Recognition on a Single Word

Question

I am trying to implement a voice-cue system for a client where they can assign a word or a phrase to a slide in PowerPoint, and when they speak that word or phrase, the slide advances. Here is the code I am using to create the grammar (I use Microsoft's SpeechRecognitionEngine for the actual work).

Choices choices = new Choices();
string word = speechSlide.Scenes[speechSlide.currentslide].speechCue;
if (word.Trim() != "")
{
    choices.Add(word);
    GrammarBuilder builder = new GrammarBuilder(choices);
    Grammar directions = new Grammar(builder);
    return directions;
}

I tried raising the confidence threshold, but I still get too many false positives. Is there a way to improve the grammar? Something tells me that adding only one word to the grammar's acceptance list is what is provoking all the false positives.

John Davis · Accepted Answer

Here is what I came up with:

As @Michael Levy said, the computer doesn't do much work when you give it one word to listen for. It basically just listens for when the audio levels hit a certain value, then assumes it must be that word. So I decided that I must give it other words that SOUND opposite. Now, my goal was not to spend weeks researching phonetics and working out a perfect algorithm for finding words that sound far away from the word I am trying to match, so I decided to focus on the first letter. Here is the order of operations:

1. Extract the trigger word to progress slides from the XML file
2. Find first letter of word
3. Find 3 letters that are most unlike the sound of the letter found in step 2
4. Find 4 words of varying length, syllable count, end sound, and second letter that begin with each of the three letters found in step 3
5. Add all 12 words found in step 4 to the choices list, along with the trigger word. There are now 13 words. One is the word we found, and the other 12 sound nothing like the word. So the computer will be darn sure that it is correct before it fires any event handlers :)
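
The steps above can be sketched roughly as follows. This is a minimal illustration, not the code from my project: the class name `DistractorWords` and the method `BuildChoices` are made up for the example, and it assumes both lookup tables are keyed by a lowercase single-letter string.

```csharp
using System;
using System.Collections.Generic;

public static class DistractorWords
{
    // opposites: letter -> the three letters judged to sound least like it.
    // words:     letter -> the four distraction words starting with that letter.
    // Assumption: both maps use lowercase single-letter keys.
    public static List<string> BuildChoices(
        string triggerWord,
        Dictionary<string, List<string>> opposites,
        Dictionary<string, List<string>> words)
    {
        var result = new List<string>();
        string trigger = (triggerWord ?? "").Trim();
        if (trigger.Length == 0)
            return result;                      // no cue, nothing to listen for

        result.Add(trigger);                    // step 1: the cue itself

        // Step 2: first letter of the cue.
        string first = trigger.Substring(0, 1).ToLowerInvariant();

        // Step 3: the three "opposite" letters; fall back to the cue alone if unknown.
        if (!opposites.TryGetValue(first, out List<string> letters))
            return result;

        foreach (string letter in letters)
        {
            if (!words.TryGetValue(letter, out List<string> candidates))
                continue;

            // Step 4: the four words stored for each opposite letter.
            foreach (string word in candidates)
            {
                // Never let the cue double as a distraction word.
                if (!string.Equals(word, trigger, StringComparison.OrdinalIgnoreCase))
                    result.Add(word);
            }
        }

        return result;                          // step 5: cue plus up to 12 distractions
    }
}
```

With three opposite letters and four words per letter, `BuildChoices` returns 13 entries, with the cue first.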

Now to determine the opposite letters, I posted a question here, but it got shut down before I got any useful advice ): I don't know why; I checked the FAQ and it seems I was within the terms described there. I decided to poll my family and friends, and our combined brainpower came up with a list of opposites. Each letter has 3 letters that sound as far away from the original letter sound as possible.

The last step was to find words for each of these letters. I found four words per letter, for a total of 104 words. I wanted words of varying length, second letter, and end sound, so that I could cover all my bases and "distract" the computer away from the target word as much as possible. I used this University Vocab List to come up with big words, and used my puny English mind to come up with words under 5 letters, and in the end I felt I had a good list. I formatted it in XML, added the parsing code, and checked the results... Much better! Almost too good! No false positives, but somebody with poor articulation will have a hard time using my program! I will make it a little easier, perhaps by reducing the number of distraction words, but overall I was very pleased with the results, and I appreciate the suggestions by @Michael Levy and @Kevin Junghans.

Code:



  abnegate,apple,argent,axe
  berate,barn,bored,battology
  chrematophobia,cremate,cease,camouflage
  dyslogy,distemper,dog,diligent
  exoteric,esoteric,enumerate,elongate
  flagitious,flatulate,fart,funeral
  gracile,grace,garner,guns
  hebetate,health,habitat,horned
  isomorphic,inside,iterate,ill
  jape,juvenescent,jove,jolly
  kinetosis,keratin,knack,kudos
  lactate,lord,limaceous,launder
  malaria,mere,morbid,murcid
  name,nemesis,noon,nuncheon
  orarian,opiate,opossum,oculars
  pharmacist,phylogeny,pelt,puny
  query,quack,quick,quisquous
  random,renitency,roinous,run
  sand,searing,sicarian,solemn
  tart,treating,thunder,thyroid
  unasinous,unit,ulcer,unthinkable
  version,visceral,vortex,vulnerable
  wand,weasiness,whimsical,wolf
  xanthopsia,xanthax,xylophone,xray
  yellow,york,yuck,ylem
  zamboni,zip,zoology,zugzwang
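
For reference, the parsing code below treats each child element's name as the letter, its first attribute as the comma-separated opposite letters, and its inner text as the word list. Here is a small sketch of a document in that shape, read the same way. The root element name `buzzlist`, the attribute name `opposites`, and the letters `k,t,p` and `s,l,m` are placeholders invented for this illustration; they are not the values from the real file.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class BuzzListShape
{
    // Placeholder document: root name, attribute name, and opposite letters are invented.
    public const string Sample =
        "<?xml version=\"1.0\" encoding=\"utf-8\"?>" +
        "<buzzlist>" +
        "<a opposites=\"k,t,p\">abnegate,apple,argent,axe</a>" +
        "<b opposites=\"s,l,m\">berate,barn,bored,battology</b>" +
        "</buzzlist>";

    public static void Parse(
        string xml,
        Dictionary<string, List<string>> opposites,
        Dictionary<string, List<string>> words)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        // ChildNodes[0] is the XML declaration here, so ChildNodes[1] is the root element.
        foreach (XmlNode node in doc.ChildNodes[1].ChildNodes)
        {
            // Element name = letter, first attribute = opposite letters, text = words.
            opposites.Add(node.Name, new List<string>(node.Attributes[0].InnerText.Split(',')));
            words.Add(node.Name, new List<string>(node.InnerText.Split(',')));
        }
    }
}
```

Note that a file without an XML declaration would have its root element at `ChildNodes[0]` instead, so indexing by position only works when the declaration is present.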

Parsing code:

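    // Requires: using System.Collections.Generic; using System.Reflection; using System.Speech.Recognition;

    // Letter -> the three letters that sound least like it.
    private Dictionary<string, List<string>> opposites;
    // Letter -> the four distraction words that start with it.
    private Dictionary<string, List<string>> words = new Dictionary<string, List<string>>();

    private void StartSpeechRecognition(Media_Slide slide)
    {
        // Load the word list once, on first use.
        if (opposites == null)
        {
            opposites = new Dictionary<string, List<string>>();
            System.Xml.XmlDocument doc = new System.Xml.XmlDocument();
            // buzzlist.xml sits next to the assembly; Remove(0, 6) strips the "file:\" prefix left over from CodeBase.
            string file = System.IO.Path.GetDirectoryName(Assembly.GetAssembly(typeof(MainWindow)).CodeBase).Remove(0, 6) + "\\buzzlist.xml";
            doc.Load(file);
            // Assumes ChildNodes[0] is the XML declaration, so ChildNodes[1] is the root element.
            foreach (System.Xml.XmlNode node in doc.ChildNodes[1].ChildNodes)
            {
                // Element name = letter, first attribute = opposite letters, inner text = words.
                opposites.Add(node.Name, new List<string>(node.Attributes[0].InnerText.Split(',')));
                words.Add(node.Name, new List<string>(node.InnerText.Split(',')));
            }
        }

        speechSlide = slide;
        rec = new SpeechRecognitionEngine();
        rec.SpeechRecognized += rec_SpeechRecognized;
        rec.SetInputToDefaultAudioDevice();
        try
        {
            rec.LoadGrammar(GetGrammar());
            rec.RecognizeAsync(RecognizeMode.Multiple);
        }
        catch
        {
            // If the grammar cannot be loaded, recognition simply does not start for this slide.
        }
    }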
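
`GetGrammar()` is called above but not shown. A minimal sketch of what such a method could look like once the distraction words are known, assuming they are passed in as a plain list; the class name `CueGrammar` and the method `Build` are hypothetical, and this is not the exact code from my project.

```csharp
using System.Collections.Generic;
using System.Speech.Recognition;

public static class CueGrammar
{
    // Builds a grammar in which the cue competes with the distraction words.
    // Returns null when there is no cue, so the caller can skip LoadGrammar.
    public static Grammar Build(string cue, IEnumerable<string> distractors)
    {
        if (string.IsNullOrWhiteSpace(cue))
            return null;

        var choices = new Choices();
        choices.Add(cue.Trim());

        foreach (string word in distractors)
        {
            // Skip blank entries, e.g. from a trailing comma in the word list.
            if (!string.IsNullOrWhiteSpace(word))
                choices.Add(word.Trim());
        }

        return new Grammar(new GrammarBuilder(choices));
    }
}
```

The recognizer then has to prefer the cue over every other entry before it reports the cue, which is the whole point of the extra words.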

Checking code:

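    void rec_SpeechRecognized(object sender, SpeechRecognizedEventArgs e)
    {
        // Only the slide's cue advances the show; a match on any distraction word is ignored.
        if (e.Result.Text == speechSlide.Scenes[speechSlide.currentslide].speechCue)
        {
            rec.UnloadAllGrammars();
            ScreenSettings.NextSlide(speechSlide);
            try
            {
                // Load the grammar for the new current slide.
                rec.LoadGrammar(GetGrammar());
            }
            catch
            {
                // No usable grammar for the new slide, so stop listening.
                rec.RecognizeAsyncCancel();
            }
        }
    }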
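
The handler above compares only the recognized text. If you also want to keep a confidence threshold, as the question tried, the two checks can be combined. A small sketch, where the helper `CueMatcher.ShouldAdvance` and the default threshold of 0.8 are illustrative choices and not values from my project:

```csharp
using System;

public static class CueMatcher
{
    // Returns true only when the recognized text is the cue AND the engine
    // reported at least minConfidence (Result.Confidence ranges from 0 to 1).
    // Assumption: 0.8 is an arbitrary starting point to tune by experiment.
    public static bool ShouldAdvance(
        string recognizedText,
        float confidence,
        string cue,
        float minConfidence = 0.8f)
    {
        if (string.IsNullOrWhiteSpace(cue))
            return false;

        bool textMatches = string.Equals(
            recognizedText?.Trim(),
            cue.Trim(),
            StringComparison.OrdinalIgnoreCase);

        return textMatches && confidence >= minConfidence;
    }
}
```

Inside the handler this would be called as `CueMatcher.ShouldAdvance(e.Result.Text, e.Result.Confidence, cue)`. Unlike the `==` comparison above, this version ignores case and surrounding whitespace, which may or may not be what you want.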
