
Reputation: 43

How to play the Azure output format "audio-16khz-128kbitrate-mono-mp3" in JavaScript

I'm calling the Azure TTS REST API with the X-Microsoft-OutputFormat header set to audio-24khz-160kbitrate-mono-mp3, and I don't know how to convert and play the audio from the response. Does anyone know how to play the audio response when calling the Azure Cognitive Services REST API?


I tried converting the response to a Blob:

 let wavFile = new Blob(res.data, {
     'type': 'audio/mp3'
 });

but without success.

Upvotes: 1

Views: 686

Answers (2)

Ling Cao

Reputation: 46

Use fetch, and make sure the request includes at least the following headers and payload:

const audio = document.createElement("audio");

fetch("{YourEndpointUrl}", {
  "headers": {
    "content-type": "application/ssml+xml",
    "ocp-apim-subscription-key": "{YourSpeechKey}",
    "x-microsoft-outputformat": "audio-24khz-160kbitrate-mono-mp3"
  },
  "body": "<speak version=\"1.0\" xmlns=\"http://www.w3.org/2001/10/synthesis\" xmlns:mstts=\"http://www.w3.org/2001/mstts\" xml:lang=\"en-US\"><voice name=\"AdaptVoice\">My SSML</voice></speak>",
  "method": "POST"
})
  // Read the MP3 response body as a Blob
  .then(resp => resp.blob())
  .then(blob => {
    // An <audio> element needs a URL, so wrap the Blob in an object URL
    audio.src = URL.createObjectURL(blob);
    audio.play();
  });

Or, more concisely, with async/await:

const audio = document.createElement("audio");

const resp = await fetch("{YourEndpointUrl}", {
  "headers": {
    "content-type": "application/ssml+xml",
    "ocp-apim-subscription-key": "{YourSpeechKey}",
    "x-microsoft-outputformat": "audio-24khz-160kbitrate-mono-mp3"
  },
  "body": "<speak version=\"1.0\" xmlns=\"http://www.w3.org/2001/10/synthesis\" xmlns:mstts=\"http://www.w3.org/2001/mstts\" xml:lang=\"en-US\"><voice name=\"AdaptVoice\">My SSML</voice></speak>",
  "method": "POST"
});
const blob = await resp.blob();
// URL.createObjectURL is synchronous, so no await is needed here
const url = URL.createObjectURL(blob);
audio.src = url;
audio.play();
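If the response bytes are already in hand, as with the `res.data` in the question, the same idea can be sketched as below. This is a minimal sketch and not part of the original answer: `toMp3Blob` and `playBlob` are hypothetical helper names, and the bytes are assumed to arrive as an ArrayBuffer or Uint8Array. The Blob constructor takes an array of parts as its first argument, so the buffer is wrapped in `[ ]`, and `audio/mpeg` is the registered MIME type for MP3.

```javascript
// Hypothetical helper: wrap raw MP3 bytes (ArrayBuffer or Uint8Array) in a Blob.
function toMp3Blob(bytes) {
  // The first argument must be an array of parts, hence [bytes].
  return new Blob([bytes], { type: "audio/mpeg" });
}

// Hypothetical helper (browser only): play a Blob through an <audio> element.
function playBlob(blob) {
  const url = URL.createObjectURL(blob);
  const audio = new Audio(url);
  // Release the object URL once playback has finished.
  audio.addEventListener("ended", () => URL.revokeObjectURL(url));
  // play() returns a Promise that rejects if the browser blocks playback.
  return audio.play();
}
```

If `res.data` comes from axios (an assumption based on the question's snippet), the request would likely need `responseType: "arraybuffer"` so the bytes are not decoded as text; `playBlob(toMp3Blob(res.data))` could then be called, ideally from a click handler, since browsers may block playback that is not triggered by a user gesture.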

Upvotes: 3

Mohit Ganorkar

Reputation: 2078

  • A workaround is to use the Azure Cognitive Services Speech SDK for JavaScript to convert the text to speech.

  • It generates a .wav file, which you can then play with the node-wav-player npm package.

Code for text to speech:

var sdk = require("microsoft-cognitiveservices-speech-sdk");
var readline = require("readline");

var audioFile = "YourAudioFile.wav";

// Replace the placeholders with your Speech resource key and region.
const speechConfig = sdk.SpeechConfig.fromSubscription('<Your Key>', '<Your Region>');

const audioConfig = sdk.AudioConfig.fromAudioFileOutput(audioFile);

// The voice that speaks the text.
speechConfig.speechSynthesisVoiceName = "hi-IN-SwaraNeural";
var synthesizer = new sdk.SpeechSynthesizer(speechConfig, audioConfig);

var rl = readline.createInterface({
    input: process.stdin,
    output: process.stdout
});

rl.question("Enter some text that you want to speak >\n> ",
    function (text) {
        rl.close();
        // Start synthesis; the first callback receives the result, the second any error.
        synthesizer.speakTextAsync(text,
            function (result) {
                if (result.reason === sdk.ResultReason.SynthesizingAudioCompleted) {
                    console.log("synthesis finished.");
                } else {
                    console.error("Speech synthesis canceled, " + result.errorDetails + "\nDid you set the speech resource key and region values?");
                }
                synthesizer.close();
                synthesizer = null;
            },
            function (err) {
                console.trace("err - " + err);
                synthesizer.close();
                synthesizer = null;
            });
        console.log("Now synthesizing to: " + audioFile);
    });

The code above is from the Microsoft documentation on text to speech with JavaScript.

The following code plays the .wav file:

const player = require('node-wav-player');

player.play({
    path: './YourAudioFile.wav',
}).then(() => {
    console.log('audio has started');
}).catch((err) => {
    console.error(err);
});
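The two snippets in this answer are separate scripts. To run them back to back, the callback-style synthesis call can be wrapped in a Promise so that playback starts only after synthesis has finished. The following is a rough sketch rather than code from the Microsoft documentation: `speakText` and `speakThenPlay` are hypothetical helper names, `synthesizer` is assumed to be an `sdk.SpeechSynthesizer` created with a file output, and `player` is assumed to be the `node-wav-player` module. It also assumes the .wav file is complete once the success callback has fired and the synthesizer has been closed.

```javascript
// Hypothetical helper: wrap the callback-style speakTextAsync in a Promise.
function speakText(synthesizer, text) {
  return new Promise((resolve, reject) => {
    synthesizer.speakTextAsync(
      text,
      (result) => {
        synthesizer.close();
        resolve(result);
      },
      (err) => {
        synthesizer.close();
        reject(new Error(String(err)));
      }
    );
  });
}

// Hypothetical helper: synthesize to the file first, then play that file.
async function speakThenPlay(synthesizer, player, text, path) {
  const result = await speakText(synthesizer, text);
  // errorDetails is set when synthesis was canceled instead of completed.
  if (result.errorDetails) {
    throw new Error("Speech synthesis canceled: " + result.errorDetails);
  }
  await player.play({ path });
  return result;
}
```

A call might look like `speakThenPlay(synthesizer, require('node-wav-player'), text, './YourAudioFile.wav')`, with a `.catch` attached to report errors.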

Upvotes: 0
