BUDDHIKA

Reputation: 316

Twilio Transcribe voice in ASP.NET MVC

I'm trying to transcribe a caller's spoken response and programmatically read it back to the caller via Twilio.

When a user calls the Twilio number, the call is routed to the action method below in the ASP.NET MVC application (based on https://www.twilio.com/docs/voice/twiml/record?code-sample=code-record-a-voicemail&code-language=C%23&code-sdk-version=5.x).

[HttpPost]
public TwiMLResult Welcome()
{
    var response = new VoiceResponse();
    try
    {
        response.Say("Please say your user Id, example ABC123, \n and press star when done", Say.VoiceEnum.Alice, null, Say.LanguageEnum.EnGb);
        // Record and transcribe the user's voice.
        response.Record(
            transcribe: true,
            transcribeCallback: new Uri("https://35eb31e3.ngrok.io/Ivr/HandleTranscribedVrn"),
            finishOnKey: "*");
        response.Say("I did not receive a recording");
    }
    catch (Exception e)
    {
        ErrorLog.LogError(e, "Error within ivr/Welcome");
        response = RejectCall();
    }

    return TwiML(response);
}  

Note: https://35eb31e3.ngrok.io/Ivr/HandleTranscribedVrn is the ngrok-tunneled public URL of the callback method.

The idea is to record the user's voice input: the user says his/her user Id and then presses the * key. After * is pressed, I expect Twilio to transcribe the recording and post the transcription text and related details to the callback action method below (https://35eb31e3.ngrok.io/Ivr/HandleTranscribedVrn).

[HttpPost]
public TwiMLResult HandleTranscribedVrn()
{
    var response = new VoiceResponse();
    try
    {
        // get the transcribed result - https://www.twilio.com/docs/voice/twiml/record#transcribe
        var result = new TranscribedResult
        {
            TranscriptionSid = Request.Params["TranscriptionSid"],
            TranscriptionText = Request.Params["TranscriptionText"],
            TranscriptionUrl = Request.Params["TranscriptionUrl"],
            TranscriptionStatus = Request.Params["TranscriptionStatus"],
            RecordingSid = Request.Params["RecordingSid"],
            RecordingUrl = Request.Params["RecordingUrl"],
            AccountSid = Request.Params["AccountSid"]
        };

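        // Read the transcribed result back to the caller.
        // The second argument of Say is the voice, not a format argument, so interpolate the text instead.
        response.Say($"You said,\n {result.TranscriptionText}");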

        // done
        response.Say("Good Bye", Say.VoiceEnum.Alice, null, Say.LanguageEnum.EnGb);
    }
    catch (Exception e)
    {
        ErrorLog.LogError(e, "Error within ivr/HandleTranscribedVrn");
        response.Say(ConversationHelper.NothingReceived, ConversationHelper.SpeakVoice, 1, ConversationHelper.SpeakLanguage);
    }
    return TwiML(response);
}

In brief, I want the callback action above to take the transcript of the user's voice input and read it out, like:

You said, {Users Voice Transcript - example - abc123}, Good Bye

The Problem

When a user calls the Twilio number, the Welcome() action runs and says:

"Please say your user Id, example ABC123, \n and press star when done"

The user says his/her user Id (EFG456) and presses the * key as instructed.

Then, instead of going to the transcription callback action (HandleTranscribedVrn), it repeats the prompt over and over until the user disconnects the call: "Please say your user Id, example ABC123, \n and press star when done"

Any help will be much appreciated.

Upvotes: 1

Views: 558

Answers (1)

BUDDHIKA

Reputation: 316

With the help of Twilio support we found a solution: use the <Gather> verb instead of <Record>. With <Record>, the transcription is delivered asynchronously to transcribeCallback, and because no action URL was set, Twilio requested the current document (Welcome) again once the recording ended, which produced the loop. <Gather> can collect speech, DTMF tones (keypad input), or both, and its action callback is executed as soon as the speech transcription is ready. More information is available at https://www.twilio.com/docs/voice/twiml/gather

Below is the sample code. I hope it helps anyone who faces a similar issue.

[HttpPost]
public ActionResult Welcome()
{
    var response = new VoiceResponse();
    try
    {
        var gatherOptionsList = new List<Gather.InputEnum>
        {
            Gather.InputEnum.Speech,
            //Gather.InputEnum.Dtmf
        };
        var gather = new Gather(
            input: gatherOptionsList,
            timeout: 60,
            finishOnKey: "*",
            action: Url.ActionUri("OnGatherComplete", "Ivr"));
        gather.Say("Please say \n", Say.VoiceEnum.Alice, 1, Say.LanguageEnum.EnGb);
        response.Append(gather);
    }
    catch (Exception e)
    {
        ErrorLog.LogError(e, "Error within ivr/Welcome");           
    }
    return TwiML(response);
}
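
One thing the Welcome action above does not cover: if <Gather> ends without any input, Twilio does not call the action URL and simply moves on to the next verb in the response. The sketch below shows one way to add a fallback after the gather. It assumes the Twilio NuGet package is referenced; the class name GatherFallbackSketch, the welcomeUrl parameter, and the 10 second timeout are illustrative choices, not part of the original solution.

```csharp
using System;
using System.Collections.Generic;
using Twilio.TwiML;
using Twilio.TwiML.Voice;

public static class GatherFallbackSketch
{
    // Builds the Welcome TwiML with a fallback placed after <Gather>.
    // Both URLs are hypothetical; pass the routes of your own application.
    public static VoiceResponse Build(Uri gatherActionUrl, Uri welcomeUrl)
    {
        var response = new VoiceResponse();

        var gather = new Gather(
            input: new List<Gather.InputEnum> { Gather.InputEnum.Speech },
            action: gatherActionUrl,
            timeout: 10); // assumed value, tune for your callers
        gather.Say("Please say your user Id", Say.VoiceEnum.Alice, 1, Say.LanguageEnum.EnGb);
        response.Append(gather);

        // These verbs are reached only when <Gather> finishes without input.
        response.Say("I did not hear anything.");
        response.Redirect(welcomeUrl);

        return response;
    }
}
```

Inside the controller this could be returned as TwiML(GatherFallbackSketch.Build(...)). Note that the redirect replays the prompt for as long as the caller stays silent, so a retry counter or a Hangup may be preferable in production.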

[HttpPost]
public TwiMLResult OnGatherComplete(string SpeechResult, double Confidence)
{
    var response = new VoiceResponse();
    try
    {
        var identifyingConfidence = Math.Round(Confidence * 100, 2);
        var transcript = $"You said {SpeechResult} with Confidence {identifyingConfidence}.\n Good Bye";
        var say = new Say(transcript);         
        response.Append(say);
    }
    catch (Exception e)
    {
        ErrorLog.LogError(e, "Error within ivr/OnGatherComplete");          
    }
    return TwiML(response);
}
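
OnGatherComplete above binds Confidence as a non-nullable double and reads SpeechResult without checking it. Twilio sends those two parameters for speech input, so a request without them (for example, one carrying only DTMF digits) would not bind cleanly. A minimal sketch of a more defensive reply builder follows; TranscriptReply is a hypothetical helper, and it assumes the action signature is changed to accept double? Confidence.

```csharp
using System;
using System.Globalization;

public static class TranscriptReply
{
    // Hypothetical helper: builds the sentence that is read back to the caller.
    public static string Build(string speechResult, double? confidence)
    {
        // No usable transcript: apologize instead of reading out an empty string.
        if (string.IsNullOrWhiteSpace(speechResult))
        {
            return "Sorry, I did not catch that. Good Bye";
        }

        var reply = "You said " + speechResult.Trim();
        if (confidence.HasValue)
        {
            // Confidence is a fraction between 0 and 1; report it as a percentage.
            var percent = Math.Round(confidence.Value * 100, 2);
            reply += " with confidence " + percent.ToString(CultureInfo.InvariantCulture) + " percent";
        }

        return reply + ". Good Bye";
    }
}
```

In the action this would be used as response.Say(TranscriptReply.Build(SpeechResult, Confidence)); keeping the text building separate from the controller also makes it easy to unit test without a phone call.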

Upvotes: 2
