Michael Schwartz

Reputation: 63

Azure speech to text transcription doesn't run continuously

I originally ran an Azure speech-to-text program that transcribed up to 15 seconds of speech from a file. Now I'm trying to change it so that it transcribes longer audio, but it still cuts out after 15 seconds of speech. The code is:

using System;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;

namespace NEST {
    class Program {
        static async Task Main(string[] args) {
            await StartContinuousRecognitionAsync();
        }

        static async Task StartContinuousRecognitionAsync() {
            // Configure the subscription information for the service to access.
            // Use either key1 or key2 from the Speech Service resource you have created
            var config = SpeechConfig.FromSubscription("subscriptionkey", "region");

            // Setup the audio configuration, in this case, using a file that is in local storage.
            using(var audioInput = AudioConfig.FromWavFileInput("C:/Users/MichaelSchwartz/source/repos/AI-102-Process-Speech-master/transcribe_speech_to_text/media/spkr1.wav"))

            // Pass the required parameters to the Speech Service which includes the configuration information
            // and the audio file name that you will use as input
            using(var recognizer = new SpeechRecognizer(config, audioInput)) {
                Console.WriteLine("Recognizing first result...");
                var result = await recognizer.StartContinuousRecognitionAsync();

                switch (result.Reason) {
                case ResultReason.RecognizedSpeech:
                    // The file contained speech that was recognized and the transcription will be output
                    // to the terminal window
                    Console.WriteLine($"We recognized: {result.Text}");
                    break;
                case ResultReason.NoMatch:
                    // No recognizable speech found in the audio file that was supplied.
                    // Output an informative message
                    Console.WriteLine($"NOMATCH: Speech could not be recognized.");
                    break;
                case ResultReason.Canceled:
                    // Operation was cancelled
                    // Output the reason
                    var cancellation = CancellationDetails.FromResult(result);
                    Console.WriteLine($"CANCELED: Reason={cancellation.Reason}");

                    if (cancellation.Reason == CancellationReason.Error) {
                        Console.WriteLine($"CANCELED: ErrorCode={cancellation.ErrorCode}");
                        Console.WriteLine($"CANCELED: ErrorDetails={cancellation.ErrorDetails}");
                        Console.WriteLine($"CANCELED: Did you update the subscription info?");
                    }
                    break;
                }
            }
        }
    }
}

The error returned is:

Cannot assign void to an implicitly-typed variable [NEST]csharp(CS0815).

How do I resolve this and transcribe utterances longer than 15 seconds? Thanks in advance.

Upvotes: 0

Views: 734

Answers (1)

Thiago Custodio

Reputation: 18387

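I'm not sure which version of the SDK you're using, but StartContinuousRecognitionAsync() returns a plain Task with no result, so assigning its awaited value to var result is what triggers CS0815. With continuous recognition the results arrive through events, so the official docs subscribe delegates to those events rather than checking result.Reason as your code does: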

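// Placeholders: use the key and region of your own Speech resource.
var speechConfig = SpeechConfig.FromSubscription("subscriptionkey", "region");
using var audioConfig = AudioConfig.FromWavFileInput("YourAudioFile.wav");
using var recognizer = new SpeechRecognizer(speechConfig, audioConfig);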

var stopRecognition = new TaskCompletionSource<int>();

recognizer.Recognizing += (s, e) =>
{
    Console.WriteLine($"RECOGNIZING: Text={e.Result.Text}");
};

recognizer.Recognized += (s, e) =>
{
    if (e.Result.Reason == ResultReason.RecognizedSpeech)
    {
        Console.WriteLine($"RECOGNIZED: Text={e.Result.Text}");
    }
    else if (e.Result.Reason == ResultReason.NoMatch)
    {
        Console.WriteLine($"NOMATCH: Speech could not be recognized.");
    }
};

recognizer.Canceled += (s, e) =>
{
    Console.WriteLine($"CANCELED: Reason={e.Reason}");

    if (e.Reason == CancellationReason.Error)
    {
        Console.WriteLine($"CANCELED: ErrorCode={e.ErrorCode}");
        Console.WriteLine($"CANCELED: ErrorDetails={e.ErrorDetails}");
        Console.WriteLine($"CANCELED: Did you update the subscription info?");
    }

    stopRecognition.TrySetResult(0);
};

recognizer.SessionStopped += (s, e) =>
{
    Console.WriteLine("\n    Session stopped event.");
    stopRecognition.TrySetResult(0);
};

await recognizer.StartContinuousRecognitionAsync();

// Wait until the session stops or is canceled, then stop recognition.
await stopRecognition.Task;
await recognizer.StopContinuousRecognitionAsync();

https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/get-started-speech-to-text?tabs=windowsinstall&pivots=programming-language-csharp
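
If you want the whole file as a single transcript rather than phrase-by-phrase console output, one option is to append each finalized phrase as its Recognized event fires. The following is a minimal sketch, not the official sample: the StringBuilder named transcript is my own addition, and the key, region, and file name are placeholders you need to replace. It assumes C# 8 or later for the using declarations.

```csharp
using System;
using System.Text;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;

class Program
{
    static async Task Main()
    {
        // Placeholders: substitute your own key, region, and WAV file.
        var speechConfig = SpeechConfig.FromSubscription("subscriptionkey", "region");
        using var audioConfig = AudioConfig.FromWavFileInput("YourAudioFile.wav");
        using var recognizer = new SpeechRecognizer(speechConfig, audioConfig);

        // Hypothetical accumulator for the full transcript (not part of the SDK).
        var transcript = new StringBuilder();
        var stopRecognition = new TaskCompletionSource<int>();

        // Each Recognized event carries one finalized phrase; append it.
        recognizer.Recognized += (s, e) =>
        {
            if (e.Result.Reason == ResultReason.RecognizedSpeech)
            {
                transcript.AppendLine(e.Result.Text);
            }
        };

        // Raised on errors and also when the end of the file is reached.
        recognizer.Canceled += (s, e) =>
        {
            if (e.Reason == CancellationReason.Error)
            {
                Console.WriteLine($"CANCELED: ErrorCode={e.ErrorCode}");
                Console.WriteLine($"CANCELED: ErrorDetails={e.ErrorDetails}");
            }
            stopRecognition.TrySetResult(0);
        };

        recognizer.SessionStopped += (s, e) => stopRecognition.TrySetResult(0);

        // Returns a Task with no result, so there is nothing to assign here.
        await recognizer.StartContinuousRecognitionAsync();

        // Wait for the end of the audio (or an error), then shut down cleanly.
        await stopRecognition.Task;
        await recognizer.StopContinuousRecognitionAsync();

        Console.WriteLine(transcript.ToString());
    }
}
```

The transcript is printed only after recognition has stopped, so no further Recognized events can modify it while it is being read.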

Upvotes: 1
