Reputation: 45
I am using Microsoft Web Chat to connect to my Bot Service, and most of the interactions are voice based. I am using Azure Speech Services, and the voice output is handled entirely by Web Chat. I currently send an inactive event to the Bot, which prompts the user if no input is received within a set duration. The timer is based on when the incoming activity was received.

In a voice-driven scenario, however, the event is sometimes sent while the Bot is still speaking, because the timer starts when the activity arrives rather than when its playback finishes. I would like to send the inactive prompt 'n' seconds after each message has been voiced out, but for that I need to know the playback duration of each message. Is there a way to get the duration of each voice message produced by the Text-To-Speech service, so that I can send the inactive prompt at the correct time?
Upvotes: 0
Views: 67
Reputation: 6368
At this time, Web Chat offers no way to capture the speech duration. If this is something you would like to see included, I recommend submitting it as a feature request for future development. You can do so here.
Speech duration is, however, a feature of Cognitive Services through their REST API (see here) and, seemingly, via the SDK (see here). You may be able to integrate speech-to-text and text-to-speech directly into your project in order to make use of the "duration" property available in the response object.
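If you do end up synthesizing the audio yourself, here is a rough sketch of how the timing could work. It does not call the Speech SDK or REST API at all. It assumes you already hold the complete synthesized audio as a PCM WAV payload with a valid header, and it derives the playback length from that header using Python's standard `wave` module. The helper names (`wav_duration_seconds`, `inactivity_delay_seconds`, `make_silent_wav`) are hypothetical, and the silent clip only stands in for real Text-To-Speech output.

```python
import io
import wave


def wav_duration_seconds(wav_bytes: bytes) -> float:
    """Return the playback length of a complete PCM WAV payload, in seconds."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        # Duration is the number of audio frames divided by frames per second.
        return wav.getnframes() / wav.getframerate()


def inactivity_delay_seconds(wav_bytes: bytes, idle_seconds: float) -> float:
    """Seconds to wait, counted from the start of playback, before sending
    the inactive event: the playback length plus the allowed idle time."""
    return wav_duration_seconds(wav_bytes) + idle_seconds


def make_silent_wav(seconds: float, rate: int = 16000) -> bytes:
    """Build a silent 16-bit mono WAV. Hypothetical stand-in for TTS output,
    used only to demonstrate the helpers above."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)   # mono
        wav.setsampwidth(2)   # 16-bit samples
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * int(seconds * rate))
    return buf.getvalue()


if __name__ == "__main__":
    clip = make_silent_wav(1.5)
    print(wav_duration_seconds(clip))
    print(inactivity_delay_seconds(clip, idle_seconds=10))
```

The idea is to start the inactivity timer from the computed delay instead of from the moment the activity arrives. Note that audio streamed from a service may carry a placeholder length in its header, in which case this calculation would not be reliable and the "duration" value from the response would be the better source.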
Hope of help!
Upvotes: 1