lovemiao

Reputation: 51

Using Microsoft speech recognition, can I get the moment when recognition starts and when it ends?

I am experimenting with the Microsoft speech recognition engine. My code looks like this:

using System;
using System.Speech.Recognition;
using System.Threading;

static ManualResetEvent _completed = null;
static void Main(string[] args)
{
     _completed = new ManualResetEvent(false);
     SpeechRecognitionEngine _recognizer = new SpeechRecognitionEngine();
     _recognizer.LoadGrammar(new Grammar(new GrammarBuilder("test")) { Name = "testGrammar" }); // load a "test" grammar
     _recognizer.LoadGrammar(new Grammar(new GrammarBuilder("exit")) { Name = "exitGrammar" }); // load an "exit" grammar
     _recognizer.SpeechRecognized += _recognizer_SpeechRecognized;
     _recognizer.SetInputToDefaultAudioDevice(); // set the input of the speech recognizer to the default audio device
     _recognizer.RecognizeAsync(RecognizeMode.Multiple); // recognize speech asynchronously, phrase after phrase
     _completed.WaitOne(); // wait until speech recognition is completed
     _recognizer.Dispose(); // dispose the speech recognition engine
}
// Must be static: it is attached from the static Main and uses the static _completed field.
static void _recognizer_SpeechRecognized(object sender, SpeechRecognizedEventArgs e)
{
     if (e.Result.Text == "test") // e.Result.Text contains the recognized text
     {
         Console.WriteLine("The test was successful!");
     }
     else if (e.Result.Text == "exit")
     {
         _completed.Set();
     }
}

It works well: the program detects when I say "test" or "exit". But can I get the exact moment when the engine starts recognizing a word, and the moment when it finishes and starts listening for the next word?

Upvotes: 4

Views: 161

Answers (2)

Mark Hall

Reputation: 54552

The SpeechRecognitionEngine has a SpeechDetected event. You can use it to determine when the engine identifies the next word to process.

From the Remarks section of the linked documentation (emphasis mine):

Each speech recognizer has an algorithm to distinguish between silence and speech. When the SpeechRecognitionEngine performs a speech recognition operation, it raises the SpeechDetected event when its algorithm identifies the input as speech. The AudioPosition property of the associated SpeechDetectedEventArgs object indicates location in the input stream where the recognizer detected speech. The SpeechRecognitionEngine raises the SpeechDetected event before it raises any of the SpeechHypothesized, SpeechRecognized, or SpeechRecognitionRejected events.
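
As a rough illustration of the event order described above, here is a minimal sketch that logs a wall-clock timestamp when speech is detected and again when the phrase is recognized or rejected. It assumes a Windows machine with a microphone and a reference to the System.Speech assembly. The class name RecognitionTiming and the log format are only placeholders for this example, and the "test"/"exit" grammars are carried over from the question.

```csharp
using System;
using System.Speech.Recognition;
using System.Threading;

// Hypothetical class name, chosen only for this sketch.
class RecognitionTiming
{
    static readonly ManualResetEvent Completed = new ManualResetEvent(false);

    static void Main()
    {
        using (var recognizer = new SpeechRecognitionEngine())
        {
            recognizer.LoadGrammar(new Grammar(new GrammarBuilder("test")) { Name = "testGrammar" });
            recognizer.LoadGrammar(new Grammar(new GrammarBuilder("exit")) { Name = "exitGrammar" });

            // Raised when the engine classifies the incoming audio as speech,
            // before any SpeechHypothesized/SpeechRecognized/SpeechRecognitionRejected event.
            recognizer.SpeechDetected += (s, e) =>
                Console.WriteLine("{0:HH:mm:ss.fff} speech detected (audio position {1})",
                                  DateTime.Now, e.AudioPosition);

            // Raised when a phrase matches one of the loaded grammars.
            recognizer.SpeechRecognized += (s, e) =>
            {
                Console.WriteLine("{0:HH:mm:ss.fff} recognized \"{1}\"", DateTime.Now, e.Result.Text);
                if (e.Result.Text == "exit") Completed.Set();
            };

            // Raised when speech was detected but matched no grammar with enough confidence.
            recognizer.SpeechRecognitionRejected += (s, e) =>
                Console.WriteLine("{0:HH:mm:ss.fff} rejected", DateTime.Now);

            recognizer.SetInputToDefaultAudioDevice();
            recognizer.RecognizeAsync(RecognizeMode.Multiple);

            Completed.WaitOne();                // block until "exit" is recognized
            recognizer.RecognizeAsyncCancel();  // stop listening before the engine is disposed
        }
    }
}
```

The timestamps here are taken when the handlers run, so they trail the actual audio slightly. The AudioPosition value is the engine's own measure of where in the input stream the speech was detected.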

Upvotes: 0

Eric Brown

Reputation: 13942

RecognitionResult.Audio has the start time and duration for the audio.

void SpeechRecognizedHandler(object sender, SpeechRecognizedEventArgs e)
{
  if (e.Result == null) return;

  // Add event handler code here.

  // The following code illustrates some of the information available
  // in the recognition result.
  Console.WriteLine("Grammar({0}): {1}", e.Result.Grammar.Name, e.Result.Text);
  Console.WriteLine("Audio for result:");
  Console.WriteLine("  Start time: " + e.Result.Audio.StartTime);
  Console.WriteLine("  Duration: " + e.Result.Audio.Duration);
  Console.WriteLine("  Format: " + e.Result.Audio.Format.EncodingFormat);
}

Upvotes: 1
