Guilherme

Reputation: 1

Text-to-speech not working when using Java API

I have text-to-speech turned on, along with the "enable beta features and API" option. On the Dialogflow web page where you add and test intents, the feature works: I get a small audio control that plays the audio corresponding to the fulfillment text.

But, when I try to get the audio via the Java API, I'm not getting it. The code below will produce the following output:

2018-07-30 02:26:53 mmcsrv.agent.SpeechIntentDetector: response_id
2018-07-30 02:26:53 mmcsrv.agent.SpeechIntentDetector: query_result
2018-07-30 02:26:53 mmcsrv.agent.SpeechIntentDetector: webhook_status

I'd expect to find an output_audio field in there, but it's missing. So where is the audio?

My Maven for this module:

<dependency>
    <groupId>com.google.cloud</groupId>
    <artifactId>google-cloud-dialogflow</artifactId>
    <version>0.53.0-alpha</version> 
</dependency>

I tried 0.55.1-alpha but Maven says it doesn't exist. Not sure if not using the latest version would matter anyway.

Can someone help? If I can't get this to work, I'll have to send the text back to Google Cloud Text-to-Speech, which I'm guessing will take longer than having the audio data right there in Dialogflow's response.

Thanks.

// Details omitted for brevity...

// Build the DetectIntentRequest
DetectIntentRequest request = DetectIntentRequest.newBuilder()
          .setSession(session.toString())
          .setQueryInput(queryInput)
          .setInputAudio(wav)
          .build();

// Performs the detect intent request
DetectIntentResponse resp = sessionsClient.detectIntent(request);

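// List the fields declared by the response message type.
// Note: this reflects the compiled proto class, not the data sent by the server.
List<FieldDescriptor> fields = resp.getDescriptorForType().getFields();

for (FieldDescriptor field : fields) {
    log.trace(field.getName());
}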

Upvotes: 0

Views: 373

Answers (1)

Guilherme

Reputation: 1

To answer my own question,

Dialogflow does send the audio, but you need the correct generated protobuf classes to read it. If you're using Maven like me, version 0.53.0-alpha of the google-cloud-dialogflow artifact pulls in version 0.18.0 of proto-google-cloud-dialogflow-v2beta1, whose generated classes do not yet have text-to-speech support.

You need version 0.20.1 or above, which you can get by adding this snippet to your pom file:

<dependency>
    <groupId>com.google.api.grpc</groupId>
    <artifactId>proto-google-cloud-dialogflow-v2beta1</artifactId>
    <version>0.20.1</version>
</dependency>

Once you do that, the DetectIntentResponse class will have a getOutputAudio() method that gives you the audio data.
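For illustration, here is a minimal sketch of what you might do with the audio once getOutputAudio() is available. It assumes you have already converted the result to a byte array (for example with resp.getOutputAudio().toByteArray()). The class name OutputAudioSaver, the helper saveOutputAudio and the .wav extension are my own choices for the example, not part of the Dialogflow API, and the actual audio encoding depends on your agent's text-to-speech settings.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class OutputAudioSaver {

    /**
     * Writes the audio bytes to the target file.
     * Returns false, without touching the file, when there is no audio
     * (null or empty array), which is what an empty output_audio looks like.
     */
    public static boolean saveOutputAudio(byte[] audio, Path target) throws IOException {
        if (audio == null || audio.length == 0) {
            return false;
        }
        Files.write(target, audio);
        return true;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in bytes for the demo. In real code this would be something like:
        //   byte[] audio = resp.getOutputAudio().toByteArray();
        byte[] audio = "RIFF".getBytes(StandardCharsets.US_ASCII);

        // Hypothetical file name; the extension is an assumption about the encoding.
        Path target = Files.createTempFile("dialogflow-reply", ".wav");

        if (saveOutputAudio(audio, target)) {
            System.out.println("Saved " + Files.size(target) + " bytes to " + target);
        } else {
            System.out.println("Response contained no audio");
        }
    }
}
```

Checking for an empty array first is useful because an empty result still tells you something: the field exists in your proto classes, but the agent did not synthesize any speech for that response.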

I have it working now.

Upvotes: 0
