Reputation: 1463
I am trying to figure out how to expose an express route (ie: Get api/word/:some_word) which uses the azure tts sdk (microsoft-cognitiveservices-speech-sdk) to generate an audio version of some_word (in any format playable by a browser), and res.send()'s the resulting audio, so that a front end javascript web app could consume the api in order to play the audio pronunciation of the word.
I have the azure sdk 'working' - it is creating an 'ArrayBuffer' inside my expressjs code. However, I do not know how to send the data in this ArrayBuffer to the front end. I have been following the instructions here: https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/get-started-text-to-speech?tabs=import%2Cwindowsinstall&pivots=programming-language-javascript#get-result-as-an-in-memory-stream
Another way to phrase my question would be 'in express, I have an ArrayBuffer whose contents is an .mp3/.ogg/.wav file. How do I send that file via express? Do I need to convert it into some other data type(like a Base64 encoded string? A buffer?) Do I need to set some particular response headers?
Upvotes: 0
Views: 415
Reputation: 1463
I finally figured it out seconds after asking this question 😂
I am pretty new to this area, so any pointers on how this could be improved would be appreciated.
app.get('/api/tts/word/:word', async (req, res) => {
const word = req.params.word;
const subscriptionKey = azureKey;
const serviceRegion = 'australiaeast';
const speechConfig = sdk.SpeechConfig.fromSubscription(
subscriptionKey as string,
serviceRegion
);
speechConfig.speechSynthesisOutputFormat =
SpeechSynthesisOutputFormat.Ogg24Khz16BitMonoOpus;
const synthesizer = new sdk.SpeechSynthesizer(speechConfig);
synthesizer.speakSsmlAsync(
`
<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis"
xmlns:mstts="https://www.w3.org/2001/mstts" xml:lang="zh-CN">
<voice name="zh-CN-XiaoxiaoNeural">
${word}
</voice>
</speak>
`,
(resp) => {
const audio = resp.audioData;
synthesizer.close();
const buffer = Buffer.from(audio);
res.set('Content-Type', 'audio/ogg; codecs=opus; rate=24000');
res.send(buffer);
}
);
});
Upvotes: 2