MG123

Reputation: 492

Azure Text to Speech (Cognitive Services) in web app - how to stop it from outputting audio?

I'm using Azure Cognitive Services for Text to Speech in a web app.

I return the bytes to the browser and it works great. However, on the server (or my local machine), the speechSynthesizer.SpeakTextAsync(inp) line also plays the audio through the speaker.

Is there a way to turn this off? This runs on a web server, and even if I ignore the sound, there's a delay while it plays the audio before the data is sent back.

Here's my code:

            var speechConfig = SpeechConfig.FromSubscription(speechKey, speechRegion);

            speechConfig.SpeechSynthesisVoiceName = "fa-IR-FaridNeural";
            speechConfig.OutputFormat = OutputFormat.Detailed;

            using (var speechSynthesizer = new SpeechSynthesizer(speechConfig))
            {
                // todo - how to disable it saying it here?
                var speechSynthesisResult = await speechSynthesizer.SpeakTextAsync(inp);
                return Convert.ToBase64String(speechSynthesisResult.AudioData);
            }

Upvotes: 1

Views: 751

Answers (1)

Mohit Ganorkar

Reputation: 2069

  • You can pass an AudioConfig to the SpeechSynthesizer.

  • In this AudioConfig object you can specify the path to a .wav file that already exists on the server.

  • Whenever you call SpeakTextAsync, the audio is redirected to the .wav file instead of the speaker.

  • You can read this audio file and run your own logic on it later.

  • Just add the following line before creating the SpeechSynthesizer object.

 var audioconfig = AudioConfig.FromWavFileOutput(filepath);

Here, filepath is the location of the .wav file, as a string.

Complete code:

            // Requires: using Microsoft.CognitiveServices.Speech;
            //           using Microsoft.CognitiveServices.Speech.Audio;
            string filepath = "<file path>";
            var speechConfig = SpeechConfig.FromSubscription(speechKey, speechRegion);
            var audioconfig = AudioConfig.FromWavFileOutput(filepath);

            speechConfig.SpeechSynthesisVoiceName = "fa-IR-FaridNeural";
            speechConfig.OutputFormat = OutputFormat.Detailed;

            using (var speechSynthesizer = new SpeechSynthesizer(speechConfig, audioconfig))
            {
                // The audio is written to the .wav file instead of being played on the speaker.
                var speechSynthesisResult = await speechSynthesizer.SpeakTextAsync(inp);
                return Convert.ToBase64String(speechSynthesisResult.AudioData);
            }
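
If you don't need the file on disk at all, a possible alternative is to pass null as the AudioConfig. As far as I know, the Speech SDK then neither plays the audio nor writes a file, and the bytes are only available through AudioData. The sketch below assumes the Microsoft.CognitiveServices.Speech NuGet package. The class name TtsSketch and the method name SynthesizeToBase64Async are made up for illustration. The cast on null is there only to pick the AudioConfig overload of the constructor unambiguously.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;

// Hypothetical helper class, not part of the SDK.
public static class TtsSketch
{
    public static async Task<string> SynthesizeToBase64Async(
        string speechKey, string speechRegion, string inp)
    {
        var speechConfig = SpeechConfig.FromSubscription(speechKey, speechRegion);
        speechConfig.SpeechSynthesisVoiceName = "fa-IR-FaridNeural";

        // A null AudioConfig means no speaker and no file output;
        // the synthesized audio is kept in memory only.
        using (var speechSynthesizer = new SpeechSynthesizer(speechConfig, (AudioConfig)null))
        {
            var result = await speechSynthesizer.SpeakTextAsync(inp);

            // Fail loudly instead of returning an empty payload.
            if (result.Reason != ResultReason.SynthesizingAudioCompleted)
            {
                throw new InvalidOperationException(
                    $"Speech synthesis did not complete: {result.Reason}");
            }

            return Convert.ToBase64String(result.AudioData);
        }
    }
}
```

This also avoids several requests writing to the same .wav path at once, which can matter on a web server.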

Upvotes: 1
