Reputation: 300
When I run an audio file through STT with the code below, it only returns the first letter/word of the entire audio.
The audio has "A B C D E F"
What am I missing?
Imports Microsoft.CognitiveServices.Speech
Imports Microsoft.CognitiveServices.Speech.SpeechConfig
Imports Microsoft.CognitiveServices.Speech.Audio
Module Module1
    Sub Main()
        Dim SpeechConfig As SpeechConfig = FromSubscription("<CHANGED>", "eastus")
        Dim audioConfig As Audio.AudioConfig = Audio.AudioConfig.FromWavFileInput("<CHANGED>.wav")
        SpeechConfig.OutputFormat = Microsoft.CognitiveServices.Speech.OutputFormat.Detailed
        Dim recognizer As New SpeechRecognizer(SpeechConfig, audioConfig)
        Dim result = recognizer.RecognizeOnceAsync().Result
        Select Case result.Reason
            Case ResultReason.RecognizedSpeech
                Console.WriteLine($"RECOGNIZED: Text={result.Text}")
                Console.WriteLine($" Intent not recognized.")
            Case ResultReason.NoMatch
                Console.WriteLine($"NOMATCH: Speech could not be recognized.")
            Case ResultReason.Canceled
                Dim cancellation = CancellationDetails.FromResult(result)
                Console.WriteLine($"CANCELED: Reason={cancellation.Reason}")
                If cancellation.Reason = CancellationReason.[Error] Then
                    Console.WriteLine($"CANCELED: ErrorCode={cancellation.ErrorCode}")
                    Console.WriteLine($"CANCELED: ErrorDetails={cancellation.ErrorDetails}")
                    Console.WriteLine($"CANCELED: Did you update the subscription info?")
                End If
        End Select
    End Sub
End Module
You can download the audio file from GitHub here: https://github.com/ullfindsmit/StackOverflowAssets/blob/master/abcdef.wav
Also, if you know where I could get more detailed STT data, I'd appreciate it. What I am looking for is something like a JSON output that gives the start time and end time along with each word and/or sentence.
Your help is much appreciated.
UPDATE: The async event handlers did not work for me for some reason. However, the code below, which calls RecognizeOnceAsync in a loop until the result is Canceled, did:
While True
    Dim result = recognizer.RecognizeOnceAsync().Result
    Select Case result.Reason
        Case ResultReason.RecognizedSpeech
            Console.WriteLine($"RECOGNIZED: Text={result.Text}")
            Console.WriteLine($" Intent not recognized.")
        Case ResultReason.NoMatch
            Console.WriteLine($"NOMATCH: Speech could not be recognized.")
        Case ResultReason.Canceled
            Dim cancellation = CancellationDetails.FromResult(result)
            Console.WriteLine($"CANCELED: Reason={cancellation.Reason}")
            If cancellation.Reason = CancellationReason.[Error] Then
                Console.WriteLine($"CANCELED: ErrorCode={cancellation.ErrorCode}")
                Console.WriteLine($"CANCELED: ErrorDetails={cancellation.ErrorDetails}")
                Console.WriteLine($"CANCELED: Did you update the subscription info?")
            End If
            Exit While
    End Select
End While
Upvotes: 1
Views: 872
Reputation: 111
The RecognizeOnceAsync method recognizes only "once": it returns the first utterance/phrase contained in the audio data file. If you'd like to recognize more than one phrase, you can do one of these two things:
1. Call RecognizeOnceAsync repeatedly. After the last phrase is recognized, the next call to the method will return a result that has result.Reason set to Canceled.
2. Switch from using RecognizeOnceAsync to using StartContinuousRecognitionAsync and hook an event handler up to the Recognizing event (interim results) or the Recognized event (final result for each phrase). The event callback lets you see the results by inspecting the SpeechRecognitionEventArgs passed to it, like this: e.Result
You can see both of these behaviors by running the Speech CLI like this:
spx recognize --once+ --key YOUR-KEY --region YOUR-REGION --file "https://github.com/ullfindsmit/StackOverflowAssets/blob/master/abcdef.wav"
spx recognize --continuous --key YOUR-KEY --region YOUR-REGION --file "https://github.com/ullfindsmit/StackOverflowAssets/blob/master/abcdef.wav"
You can download the Speech CLI here: https://aka.ms/speech/spx-zips.zip
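For reference, the continuous-recognition option described above might look roughly like this in VB.NET. This is only a sketch: the key, region, and file name are placeholders, and the module name is made up for illustration. It waits on a TaskCompletionSource so that Main does not return before the events have fired, which is one possible (unconfirmed) reason event handlers can appear not to work in a console app. It also calls RequestWordLevelTimestamps and prints the raw JSON of each result, which should carry per-word Offset and Duration values (in 100-nanosecond units) and so may cover the start/end time request in the question.

```vb
Imports System
Imports System.Threading.Tasks
Imports Microsoft.CognitiveServices.Speech
Imports Microsoft.CognitiveServices.Speech.Audio

Module ContinuousSketch ' hypothetical module name
    Sub Main()
        ' Placeholders: substitute your own key, region, and WAV path.
        Dim config As SpeechConfig = SpeechConfig.FromSubscription("<YOUR-KEY>", "<YOUR-REGION>")
        config.OutputFormat = OutputFormat.Detailed
        config.RequestWordLevelTimestamps()

        Using audioInput As AudioConfig = AudioConfig.FromWavFileInput("abcdef.wav")
            Using recognizer As New SpeechRecognizer(config, audioInput)
                ' Completed when the session stops or recognition is canceled.
                Dim finished As New TaskCompletionSource(Of Boolean)()

                ' Recognized fires once per phrase with the final result.
                AddHandler recognizer.Recognized,
                    Sub(sender, e)
                        If e.Result.Reason = ResultReason.RecognizedSpeech Then
                            Console.WriteLine($"RECOGNIZED: Text={e.Result.Text}")
                            ' Raw service response as JSON, including word timings.
                            Console.WriteLine(e.Result.Properties.GetProperty(
                                PropertyId.SpeechServiceResponse_JsonResult))
                        End If
                    End Sub

                ' Canceled fires on errors and when the end of the file is reached.
                AddHandler recognizer.Canceled,
                    Sub(sender, e)
                        Console.WriteLine($"CANCELED: Reason={e.Reason}")
                        finished.TrySetResult(True)
                    End Sub

                AddHandler recognizer.SessionStopped,
                    Sub(sender, e) finished.TrySetResult(True)

                recognizer.StartContinuousRecognitionAsync().Wait()
                finished.Task.Wait() ' keep Main alive until recognition ends
                recognizer.StopContinuousRecognitionAsync().Wait()
            End Using
        End Using
    End Sub
End Module
```

Compared with calling RecognizeOnceAsync in a loop, this keeps a single session open for the whole file and delivers each phrase through the Recognized handler as it completes.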
Upvotes: 1