Reputation:
I'm using the Amazon Lex service. My input is always a text message, but sometimes I'd like a spoken response in addition to the text. I configured an output voice in the Lex settings.
I've tried adding the header `amz-lex:accept-content-types=SSML` to the request, but it returns with `Invalid Bot Configuration: No usable messages given the current slot and sessionAttribute set. (Service: AmazonLexRuntime; Status Code: 400; Error Code: BadRequestException;`. The same request works just fine when I ask for `PlainText`. And even if I ask for `SSML,PlainText`, it responds with plain text only.
Do I need to configure something else inside Lex to allow it to do voice responses?
Upvotes: 1
Views: 3122
Reputation: 1
You can use SSML (Speech Synthesis Markup Language) for this by setting the message content type to SSML. This also lets you test voice in the Lex test bot console.
With SSML tags you can customize and control aspects of speech such as pronunciation, volume, and speech rate.
SSML provides a variety of tags for adjusting pronunciation to your requirements, e.g. the say-as tag:
`"message": {
"contentType": "SSML",
"content": "<speak>Hi " + data["User ID"].split(".")[0] + ", your reference number <say-as interpret-as='characters'>" + "ABC" + event.currentIntent.slots.RefNo + "</say-as> is, " + data["Status"] + "</speak>"
}`
SSML support in text responses was introduced in 2018: https://aws.amazon.com/about-aws/whats-new/2018/02/announcing-responses-capability-in-amazon-lex-and-ssml-support-in-text-response/
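For context, here is a minimal sketch of how a message like the one above could sit inside a full Lambda response, assuming the Lex V1 response format with a `Close` dialog action. The helper names `escapeXml` and `buildSsmlCloseResponse` and their parameters are made up for illustration and are not part of any AWS API. The escaping step is an addition so that slot values cannot break the SSML markup.

```javascript
// Escape characters that are special in XML so that user-supplied
// values cannot break the surrounding SSML markup.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Hypothetical helper: builds a Lex V1 Lambda response that closes the
// intent with an SSML message. The "ABC" prefix mirrors the snippet above.
function buildSsmlCloseResponse(userName, refNo, status) {
  const content =
    "<speak>Hi " + escapeXml(userName) +
    ", your reference number <say-as interpret-as='characters'>ABC" +
    escapeXml(refNo) +
    "</say-as> is, " + escapeXml(status) + "</speak>";

  return {
    dialogAction: {
      type: "Close",
      fulfillmentState: "Fulfilled",
      message: { contentType: "SSML", content: content },
    },
  };
}
```

In a handler this might be called as `buildSsmlCloseResponse(data["User ID"].split(".")[0], event.currentIntent.slots.RefNo, data["Status"])`.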
Upvotes: 0
Reputation: 3287
Lex cannot actually output voice by itself.
Lex always returns a JSON response, and that response has to be processed by the channel through which the user accesses Lex. It is the channel that outputs text or voice, depending on how it handles the response message delivered by Lex.
Amazon Lex can handle speech-to-text.
Amazon Polly can do the reverse: text-to-speech.
If you go to the above Lex page, they have a few examples of using Lex for conversation logic and then Polly for text-to-speech and outputting voice to the user.
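As a rough sketch of that hand-off, the function below maps a Lex text response to the parameters of a Polly SynthesizeSpeech request. It assumes the response carries `message` and `messageFormat` fields, as the Lex V1 runtime PostText response does. The function name `toPollyParams`, the default voice `Joanna`, and the `mp3` output format are choices made for illustration only.

```javascript
// Hypothetical helper: converts a Lex runtime response into the parameter
// object expected by Polly's SynthesizeSpeech operation.
function toPollyParams(lexResponse, voiceId = "Joanna") {
  // Polly needs to be told whether the text is SSML or plain text.
  const isSsml = lexResponse.messageFormat === "SSML";

  return {
    Text: lexResponse.message,
    TextType: isSsml ? "ssml" : "text",
    OutputFormat: "mp3", // assumed output format for this sketch
    VoiceId: voiceId,    // assumed voice; use the one that suits the bot
  };
}
```

The resulting object could then be passed to the SynthesizeSpeech call of the AWS SDK's Polly client, and the returned audio stream played back to the user by the channel.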
Upvotes: 3