Reputation: 59
I am a Java developer and I have a couple of questions about the Google Speech API v1beta1.
First, I uploaded a short audio file (less than one minute long) through GCS to the Google Speech API. The call works, but the confidence in the output is only 0.32497215, and the transcript does not exactly match my audio input.
How can I increase the confidence level of the output?
Second, I tried a longer audio file (more than one minute long). In this case I used the API call:
https://speech.googleapis.com/v1beta1/speech:asyncrecognize?key=XXXXXXXXXXXXXXXXXXXX
and the payload (built as a Java string):
"{\"config\":{\"encoding\":\"LINEAR16\",\"sample_rate\": 16000},\"audio\":{\"uri\":\"gs://" + bucketName + "/" + objectName + "\"}}"
This returned JSON like
{"name": "57...........................95"}.
After getting this output I made a new API call (to the Operations interface) with this name value:
https://speech.googleapis.com/v1beta1/operations/57.................................95?key=XXXXXXXXXXXXXXXXX
and got this output:
{
"name": "57....................................95",
"done": true,
"response": {
"@type": "type.googleapis.com/google.cloud.speech.v1beta1.AsyncRecognizeResponse"
}
}
How do I proceed from this response? It contains no transcript, and I need the recognized text of the audio.
Please help me fix these issues. Thanks in advance.
Upvotes: 3
Views: 1205
Reputation: 2202
Ideas to Question 1:
You should give more details in the RecognitionConfig object, for example specify the languageCode and add hints via the SpeechContext object.
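As a rough illustration of that suggestion, here is a minimal Java sketch that builds the request payload with a language code and phrase hints added. The JSON field names language_code and speech_context.phrases are my assumption, chosen to match the snake_case sample_rate used in the question; please verify them against the v1beta1 REST reference. The class name, method names, bucket, object, and phrases are all hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class RecognizePayload {

    // Wraps a value in double quotes, escaping backslashes and quotes for JSON.
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Builds the request body. The field names "language_code" and
    // "speech_context" are assumptions, not confirmed by the question.
    static String build(String bucketName, String objectName, int sampleRate,
                        String languageCode, List<String> phrases) {
        String phraseArray = phrases.stream()
                .map(RecognizePayload::quote)
                .collect(Collectors.joining(",", "[", "]"));
        return "{\"config\":{"
                + "\"encoding\":\"LINEAR16\","
                + "\"sample_rate\":" + sampleRate + ","
                + "\"language_code\":" + quote(languageCode) + ","
                + "\"speech_context\":{\"phrases\":" + phraseArray + "}"
                + "},\"audio\":{\"uri\":"
                + quote("gs://" + bucketName + "/" + objectName) + "}}";
    }

    public static void main(String[] args) {
        // Hypothetical bucket, object, and hint phrases.
        System.out.println(build("my-bucket", "audio.wav", 16000, "en-US",
                Arrays.asList("speech", "recognition")));
    }
}
```

Hints tend to help most with domain-specific words and names that are likely to occur in the recording; in real code a JSON library would be a safer way to build the body than string concatenation.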
Answer to Question 2:
Check the sample rate of the audio file; you must be sure that it is equal to the rate you gave in the request. You can check it, for example, with the command soxi audio_file.flac (sox is needed for this one).
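If you would rather do the check from Java, something like the sketch below should work for WAV files, using only the standard javax.sound.sampled package. Note that the stock JDK does not read FLAC, so this assumes the LINEAR16 audio is stored in a WAV container. The class name, the helper names, and the default file name audio_file.wav are hypothetical.

```java
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.UnsupportedAudioFileException;
import java.io.File;
import java.io.IOException;

public class SampleRateCheck {

    // Reads the sample rate (in Hz) from the header of an audio file
    // that the JDK can parse, such as WAV.
    static float sampleRateOf(File audioFile)
            throws IOException, UnsupportedAudioFileException {
        AudioFormat format = AudioSystem.getAudioFileFormat(audioFile).getFormat();
        return format.getSampleRate();
    }

    // True when the file's rate equals the rate sent in the request.
    static boolean matches(float actualRate, int requestedRate) {
        return Math.round(actualRate) == requestedRate;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical file name; pass your own path as the first argument.
        File file = new File(args.length > 0 ? args[0] : "audio_file.wav");
        int requestedRate = 16000; // the sample_rate value from the request payload
        float actualRate = sampleRateOf(file);
        System.out.println("File sample rate: " + actualRate + " Hz");
        if (!matches(actualRate, requestedRate)) {
            System.out.println("Mismatch: the request declares " + requestedRate + " Hz");
        }
    }
}
```

If the rates differ, either resample the file or change sample_rate in the request so that both agree.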
Upvotes: 1