Reputation: 23
I am using the Google TTS API and would like to use timepoints to show the words of a sentence at the right time (like subtitles). Unfortunately, I cannot get this to work.
HTTP request
POST https://texttospeech.googleapis.com/v1beta1/text:synthesize
Request body
{
"input": {
"ssml": "<speak>Hello World</speak>"
},
"voice": {
"languageCode": "nl-NL",
"name": "nl-NL-Wavenet-E",
"ssmlGender": "FEMALE"
},
"audioConfig": {
"audioEncoding": "MP3"
},
"enableTimePointing": [
"SSML_MARK"
]
}
Response body
{
"audioContent": "base64",
"timepoints": [],
"audioConfig": {
"audioEncoding": "MP3",
"speakingRate": 1,
"pitch": 0,
"volumeGainDb": 0,
"sampleRateHertz": 24000,
"effectsProfileId": []
}
}
I'm expecting a Timepoint object in return, but as you can see, it returns an empty array.
Upvotes: 2
Views: 918
Reputation: 7287
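To get timepoints, you need to add <mark> tags to your SSML input. With SSML_MARK time pointing, a timepoint is returned for each mark, so input without marks gives an empty array. Here is an example using your request body.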
Request body:
{
"input": {
"ssml": "<speak><mark name=\"1st\"/>Hello <mark name=\"2nd\"/>world</speak>"
},
"voice": {
"languageCode": "nl-NL",
"name": "nl-NL-Wavenet-E",
"ssmlGender": "FEMALE"
},
"audioConfig": {
"audioEncoding": "MP3"
},
"enableTimePointing": [
"SSML_MARK"
]
}
I added <mark name=\"1st\"/> and <mark name=\"2nd\"/> to create two marks, just to show how to add multiple marks. If you only need a single mark, remove the second one and the response should then contain a single timepoint.
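Since the goal is subtitle-style display, here is a rough Python sketch of how the marks could be generated per word and matched back afterwards. The helper names (build_marked_ssml, build_request_body, words_with_times) and the w0, w1, ... mark naming are my own choices for illustration, not part of the API. I am also assuming each returned timepoint carries markName and timeSeconds fields, so check that against your actual response. It does not send the request itself.

```python
import json
from xml.sax.saxutils import escape


def build_marked_ssml(sentence):
    """Return (ssml, words) with one <mark/> placed before each word.

    The mark names w0, w1, ... are an arbitrary convention chosen here.
    """
    words = sentence.split()
    parts = [f'<mark name="w{i}"/>{escape(word)}' for i, word in enumerate(words)]
    return "<speak>" + " ".join(parts) + "</speak>", words


def build_request_body(ssml):
    """Build a request body with the same settings as the example above."""
    return {
        "input": {"ssml": ssml},
        "voice": {
            "languageCode": "nl-NL",
            "name": "nl-NL-Wavenet-E",
            "ssmlGender": "FEMALE",
        },
        "audioConfig": {"audioEncoding": "MP3"},
        "enableTimePointing": ["SSML_MARK"],
    }


def words_with_times(words, timepoints):
    """Pair each word with the time of its mark, sorted by time.

    Assumes each timepoint is a dict with "markName" and "timeSeconds".
    Timepoints whose mark name is unknown are skipped.
    """
    index_by_mark = {f"w{i}": i for i in range(len(words))}
    cues = []
    for tp in timepoints:
        i = index_by_mark.get(tp.get("markName"))
        if i is not None:
            cues.append((tp["timeSeconds"], words[i]))
    return sorted(cues)


ssml, words = build_marked_ssml("Hello world")
print(json.dumps(build_request_body(ssml), indent=2))

# Made-up timings for illustration only, NOT real API output.
sample_timepoints = [
    {"markName": "w0", "timeSeconds": 0.0},
    {"markName": "w1", "timeSeconds": 0.5},
]
for start, word in words_with_times(words, sample_timepoints):
    print(f"{start:.2f}s  {word}")
```

The printed JSON can be used as the POST body for the text:synthesize endpoint from the question. Once you have the real timepoints array, pass it to words_with_times in place of the sample list to get (start time, word) pairs for the subtitle display.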
Response (I just included a snippet of the base64):
Upvotes: 2