How can I get long dictation results from the REST Speech Recognition API in Microsoft's Cognitive Services?

Question

I was able to get short dictation results from the REST API of Bing Voice Recognition. My goal is to get responses for audio files that are longer than 15-30 seconds (i.e. long dictation mode). I'm developing an HTML UWP app, and this is what I do to get the short results:

1. Generate an ArrayBuffer from an audio file (WAV).
2. Authenticate with an access token.
3. Send the audio data to the REST API with the following settings:

var accessToken = [[accessToken]]; // placeholder for the token obtained during authentication
var url = 'https://speech.platform.bing.com/recognize?'; 
var params = {
    'version': '3.0',
    'format': 'json',
    'locale': 'en-US',
    'device.os': 'Windows OS',
    'scenarios': 'smd',
    'appid': 'D4D52672-91D7-4C74-8AD8-42B1D98141A5',
    'requestid': guid(),
    'instanceid': guid()
};
var options = {
    url: url + $.param(params),
    type: "POST",
    headers: {
        'Authorization': 'Bearer ' + accessToken,
        'Content-Type': 'audio/wav; samplerate=16000'
    },
    data: data
};
return WinJS.xhr(options);
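The guid() helper called above is not shown. A minimal sketch of what it could look like, assuming it only needs to return a random version-4-style UUID string for 'requestid' and 'instanceid' (the implementation below is an illustration, not the original helper):

```javascript
// Hypothetical guid() helper: returns a random UUID-shaped string.
// Assumption: the service only needs a unique id per request, so
// Math.random() is good enough (it is not cryptographically strong).
function guid() {
    return 'xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx'.replace(/[xy]/g, function (c) {
        var r = Math.random() * 16 | 0;           // random integer 0..15
        var v = c === 'x' ? r : (r & 0x3 | 0x8);  // 'y' digit is forced to 8, 9, a or b
        return v.toString(16);
    });
}
```

Each call produces a new value, so 'requestid' and 'instanceid' differ within a single request.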

So this works! But how can I do this for long dictation scenarios?
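To tell which files fall outside the short-dictation window, the clip length can be read from the WAV header in the ArrayBuffer. A rough sketch, assuming an uncompressed PCM RIFF/WAVE file; wavInfo is a hypothetical helper name and not part of any SDK:

```javascript
// Hypothetical helper: reads sample rate and duration from a WAV ArrayBuffer.
// Assumption: a RIFF/WAVE container with a 'fmt ' chunk before the 'data' chunk.
function wavInfo(buffer) {
    var view = new DataView(buffer);

    // Read a four-character chunk tag at the given byte offset.
    function tag(offset) {
        return String.fromCharCode(
            view.getUint8(offset), view.getUint8(offset + 1),
            view.getUint8(offset + 2), view.getUint8(offset + 3));
    }

    if (buffer.byteLength < 12 || tag(0) !== 'RIFF' || tag(8) !== 'WAVE') {
        throw new Error('Not a RIFF/WAVE file');
    }

    var sampleRate = null, byteRate = null, dataBytes = null;
    var offset = 12; // first chunk starts after 'RIFF', file size and 'WAVE'

    while (offset + 8 <= buffer.byteLength) {
        var id = tag(offset);
        var size = view.getUint32(offset + 4, true); // chunk sizes are little-endian
        if (id === 'fmt ' && offset + 24 <= buffer.byteLength) {
            sampleRate = view.getUint32(offset + 12, true);
            byteRate = view.getUint32(offset + 16, true);
        } else if (id === 'data') {
            dataBytes = size;
            break;
        }
        offset += 8 + size + (size % 2); // chunks are padded to an even length
    }

    if (!byteRate || dataBytes === null) {
        throw new Error('Missing fmt or data chunk');
    }
    return { sampleRate: sampleRate, durationSeconds: dataBytes / byteRate };
}
```

Calling wavInfo(data) before the request gives the sample rate to compare against the 'samplerate=16000' in the Content-Type header, and durationSeconds shows whether the clip is longer than the 15-30 seconds that the short mode handles.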

Please don't refer me to the JavaScript GitHub repository at https://github.com/microsoft/Cognitive-Speech-STT-Javascript. It only works for short dictation, and it does not work in the Edge browser.

