Manish Sridhar

Reputation: 45

Control of Text-to-Speech output audio in Microsoft WebChat

I am using Microsoft WebChat to connect to my Bot Service, and most of the interactions are voice based. I am using Azure Speech Services, and the voice output is handled entirely by WebChat. I currently send an inactivity event to the Bot, which prompts the user if no input is received within a set duration. The timing is based on when the incoming activity is received. In a voice-driven scenario, however, the event is sometimes sent while the Bot is still speaking, because the timer starts when the message arrives rather than when it has been fully voiced.

I would like to send the inactivity prompt 'n' seconds after each message has been voiced out, but for this I need to know the playback duration of each message. Is there a way to get the duration of each voice message produced by the Text-To-Speech Service, so that I can send the inactivity prompt at the correct time?

Upvotes: 0

Views: 67

Answers (1)

Steven Kanberg

Reputation: 6368

At this time, with respect to Web Chat, there is no way to capture the speech duration. I would recommend submitting this as a feature request for future development, if it is something you would like to see included. You can do so here.

The duration is, however, available from Cognitive Services through their REST API (see here) and, seemingly, through the SDK (see here). You may be able to integrate speech-to-text and text-to-speech directly into your project in order to make use of the "duration" property available in the response object.
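
If you do go the SDK route, the timing logic could look roughly like the sketch below. It assumes the synthesis result reports its audio duration in 100-nanosecond ticks, which is my understanding of the JavaScript Speech SDK but worth checking against the docs. The names `ticksToMilliseconds`, `inactivityDelayMs` and `InactivityTimer` are hypothetical helpers, not part of Web Chat or the SDK.

```typescript
// Assumption: the duration arrives in 100-nanosecond ticks,
// so 10,000 ticks equal one millisecond.
const TICKS_PER_MILLISECOND = 10000;

// Convert a tick count into milliseconds.
function ticksToMilliseconds(ticks: number): number {
  return ticks / TICKS_PER_MILLISECOND;
}

// Total wait before the inactivity prompt: the time the message takes
// to play, plus the 'n' idle seconds that should follow it.
function inactivityDelayMs(audioDurationTicks: number, idleSeconds: number): number {
  return ticksToMilliseconds(audioDurationTicks) + idleSeconds * 1000;
}

// Hypothetical helper: restarts the countdown for every spoken message
// and cancels it when the user responds.
class InactivityTimer {
  private handle: ReturnType<typeof setTimeout> | undefined;

  constructor(private readonly onInactive: () => void) {}

  // Call when a bot message starts playing. Returns the scheduled delay in ms.
  restart(audioDurationTicks: number, idleSeconds: number): number {
    this.cancel();
    const delay = inactivityDelayMs(audioDurationTicks, idleSeconds);
    this.handle = setTimeout(this.onInactive, delay);
    return delay;
  }

  // Call when the user speaks or types, so that no prompt is sent.
  cancel(): void {
    if (this.handle !== undefined) {
      clearTimeout(this.handle);
      this.handle = undefined;
    }
  }
}
```

You would call `restart` with the duration of each synthesized message and your 'n' seconds, and have the callback post the inactivity event to the bot. Call `cancel` as soon as the user responds.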

Hope of help!

Upvotes: 1
