How can I create a multi-part response with Dialogflow?

Question

I have a conversational app that uses webhooks to reach my backend PHP server, which sends JSON responses back to the Dialogflow API. So far, it's working rather well.

The next step in the development would be to have the Google Assistant respond to the user with multi-part responses. I've seen the "Lucky Trivia" game do something similar (screenshot attached).

It is not clear to me how I can have the Assistant App generate multiple bubbles.

Some solutions I've tried:

Using rich responses with multiple parts
Generating SSML responses and using several different tags
Using message objects
Using a followupEvent object

None of these have gotten me to the point I'd like.

Rich responses will work for a maximum of two separate bubbles and no more.

SSML seems promising and is a great way to add prosody and sound bites, but everything I've tried will not deliver multi-part speech bubbles.

I can't find a syntax for message objects that works with "platform":"google". Indeed, specific support for platform=google isn't listed on that page, but I have seen it in some request/response JSON objects.

The followupEvent response seemed most promising, but as far as I can tell, the intent that triggers from the named event completely replaces the current response; it doesn't just add onto it.

So, my question is: What's the best strategy for getting similar multi-part messages on Google Assistant using Dialogflow?

Optimally, I'd like to fire new requests to my webhook sequentially, but building one large response containing all parts is a viable option if necessary.

How does Lucky Trivia do this?

Prisoner · Accepted Answer

I suspect that Lucky Trivia is able to get around the rules because it was made by Google and doesn't use the same library that we do. But let's look at each of your attempts and then some possible other approaches.

What doesn't work

As you note, a RichResponse is limited to two SimpleResponses, which translate to two text bubbles. You could make each bubble longer, but there is still a suggested limit of 300 characters per bubble and a hard limit of 640 characters.
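To make that limit concrete, here is a minimal Python sketch of a webhook response carrying two simple responses (your backend is PHP, but the JSON shape is what matters). The helper names are hypothetical, and the payload.google wrapper assumes the Dialogflow V2 webhook format; as far as I recall, V1 used data.google instead. The checks simply mirror the two-bubble and 640-character limits described above.

```python
import json

MAX_SIMPLE_RESPONSES = 2  # text bubbles allowed per rich response, as noted above
HARD_CHAR_LIMIT = 640     # hard per-bubble character limit, as noted above


def simple_response(text):
    """Build one simpleResponse item, which renders as one chat bubble."""
    if len(text) > HARD_CHAR_LIMIT:
        raise ValueError("bubble text exceeds %d characters" % HARD_CHAR_LIMIT)
    return {"simpleResponse": {"textToSpeech": text, "displayText": text}}


def rich_response(bubbles, expect_user_response=True):
    """Wrap one or two bubbles in a webhook response (hypothetical helper)."""
    if not 1 <= len(bubbles) <= MAX_SIMPLE_RESPONSES:
        raise ValueError("a rich response takes one or two simple responses")
    return {
        "payload": {  # assumption: Dialogflow V2 webhook format
            "google": {
                "expectUserResponse": expect_user_response,
                "richResponse": {
                    "items": [simple_response(b) for b in bubbles]
                },
            }
        }
    }


if __name__ == "__main__":
    body = rich_response(["Welcome back!", "Ready for the first question?"])
    print(json.dumps(body, indent=2))
```

Passing a third string to rich_response raises a ValueError, which is the same wall you ran into with the real API.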

The SSML responses, as the name suggests, are about what you hear - not so much what you see.

Message objects are turned into native platform objects anyway, so unless Google supported more bubbles natively (and it doesn't), you can't do it this way.

Follow-up events are specifically documented to ignore the text that is returned from the original event. Their entire point is to delegate processing to the other intent.
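For reference, this is roughly what a follow-up event response looks like, sketched in Python. It assumes the Dialogflow V2 field name followupEventInput (your question uses the V1 name followupEvent), and the helper and event names are made up for illustration. Since the text of the original response is dropped, anything the second intent needs has to travel as event parameters.

```python
import json


def followup_event(name, parameters=None, language_code="en-US"):
    """Build a webhook response that hands processing to another intent.

    Hypothetical helper; assumes the Dialogflow V2 webhook format.
    No text field is included, because the text of the response that
    triggers the event is ignored anyway.
    """
    return {
        "followupEventInput": {
            "name": name,
            "languageCode": language_code,
            "parameters": parameters or {},
        }
    }


if __name__ == "__main__":
    # "next_part" is a made-up event name that an intent would listen for.
    print(json.dumps(followup_event("next_part", {"part": 2}), indent=2))
```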

What might work: Cards

This doesn't look exactly the same as what you want, but one way to get additional text included that is separate from the two bubbles is through a Basic card as one of the rich response items. You can even do some basic formatting in the card and include graphics.
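A minimal sketch of that idea in Python follows. The helper names, card text, and image URL are placeholders, and the payload.google wrapper again assumes the Dialogflow V2 webhook format. As far as I know, the first item of a rich response has to be a simple response, so the card goes after it.

```python
import json


def basic_card(title, formatted_text, image_url=None, image_alt=None):
    """Build a basicCard item (hypothetical helper)."""
    card = {"title": title, "formattedText": formatted_text}
    if image_url:
        card["image"] = {
            "url": image_url,
            "accessibilityText": image_alt or title,
        }
    return {"basicCard": card}


def response_with_card(intro_text, card_item):
    """Put one text bubble first, then the card with the extra text."""
    intro = {
        "simpleResponse": {"textToSpeech": intro_text, "displayText": intro_text}
    }
    return {
        "payload": {  # assumption: Dialogflow V2 webhook format
            "google": {
                "expectUserResponse": True,
                "richResponse": {"items": [intro, card_item]},
            }
        }
    }


if __name__ == "__main__":
    card = basic_card(
        "Round 1",
        "**Question:** which planet is closest to the sun?",  # basic formatting
        image_url="https://example.com/planets.png",  # placeholder URL
        image_alt="The planets",
    )
    print(json.dumps(response_with_card("Here is your question.", card), indent=2))
```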

More complicated: Media Response

Including a Media response object with the rich response items is a way you can send multiple responses to the user without having to wait for them to say something. In this way, you can get multiple text bubbles in a row without the user having to reply.

The trick is that you'll send the two simple responses in the rich response, and then include a Media response with a very short, and possibly silent, audio file.

After the audio file finishes playing, you'll get an intent that indicates the media has finished playing. You can then send another reply with one or two more simple responses. If necessary, you can repeat this.

There are some downsides - the media player will show while it is playing, which will interrupt the bubbles, but once done it should clear. There will also be a pause in between some of the bubbles. But playing audio might also enhance your reply.
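The flow described above can be sketched in Python like this. Treat it as an outline under several assumptions rather than a tested integration: the helper names and the audio URL are placeholders, the payload.google wrapper and the originalDetectIntentRequest path assume the Dialogflow V2 webhook format, and the MEDIA_STATUS argument with a FINISHED status is how I understand the media-finished request to be reported. I also believe a media response needs suggestion chips when the reply is not final, so the sketch adds one.

```python
SILENT_AUDIO_URL = "https://example.com/silence.mp3"  # placeholder, host your own


def chunk_pairs(bubbles):
    """Split the full list of bubbles into groups of at most two."""
    return [bubbles[i:i + 2] for i in range(0, len(bubbles), 2)]


def build_part(pair, more_to_come):
    """Build one webhook response: up to two bubbles, plus audio if more follow."""
    items = [
        {"simpleResponse": {"textToSpeech": text, "displayText": text}}
        for text in pair
    ]
    rich = {"items": items}
    if more_to_come:
        # The short audio clip is what triggers the next request to the webhook.
        items.append({
            "mediaResponse": {
                "mediaType": "AUDIO",
                "mediaObjects": [
                    {"name": "One moment", "contentUrl": SILENT_AUDIO_URL}
                ],
            }
        })
        # Assumption: chips are required alongside a non-final media response.
        rich["suggestions"] = [{"title": "Skip"}]
    return {
        "payload": {
            "google": {"expectUserResponse": True, "richResponse": rich}
        }
    }


def media_finished(request):
    """Return True if the request reports that the audio finished playing."""
    payload = request.get("originalDetectIntentRequest", {}).get("payload", {})
    for user_input in payload.get("inputs", []):
        for arg in user_input.get("arguments", []):
            status = arg.get("extension", {}).get("status")
            if arg.get("name") == "MEDIA_STATUS" and status == "FINISHED":
                return True
    return False


def respond(request, bubbles, part_index):
    """Pick the part to send; the caller tracks part_index between requests."""
    parts = chunk_pairs(bubbles)
    if media_finished(request):
        part_index += 1
    part_index = min(part_index, len(parts) - 1)
    more_to_come = part_index < len(parts) - 1
    return build_part(parts[part_index], more_to_come), part_index
```

You would keep part_index between requests yourself, for example in a context or in session storage on your server, and map the media-finished event to an intent that calls the webhook.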
