rajanb

Reputation: 63

Google Cloud Speech API returning empty result

I have been using the Chromium Google Speech API and recently switched to the Google Cloud Speech API. Since the Google Cloud Speech API was announced, recognition accuracy seems to have degraded. I am also seeing more and more "empty results" coming back for streamed audio.

I stream the same audio simultaneously to several services. The Google Cloud Speech API returns an empty result while some of the other services return transcribed text. This makes me wonder whether something has changed in the way the Chromium Speech API and the Google Cloud Speech API work.

I validated that the audio has proper headers and that I am actually streaming audio to Google.

Is anyone else seeing Google return an empty result sometimes (or, in my case, most of the time)?

Upvotes: 6

Views: 3543

Answers (3)

Andrey Azimov

Reputation: 11

I had the same problem: the Google Speech API returned an empty result. I used FFmpeg to convert my audio file to LINEAR16. I installed the tool with Homebrew:

brew install ffmpeg

To convert my audio file to LINEAR16 I used this command:

ffmpeg -i input.flac -f s16le -acodec pcm_s16le output.raw

Then I uploaded it to my Google Cloud Storage bucket: https://console.cloud.google.com/storage/browser/

Here is my JSON file with the config for the request:

{
  "config": {
      "encoding": "LINEAR16",
      "sampleRate": 16000,
      "languageCode": "en-US"
  },
  "audio": {
      "uri": "gs://your-bucket-name/output.raw"
  }
}
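If you would rather not hand-edit the file, the same request body can be generated from a script, which guarantees valid JSON quoting. This is only a minimal sketch: the function name `build_request` is hypothetical, and the field names are simply copied from the config shown above.

```python
import json


def build_request(gcs_uri, sample_rate=16000, language_code="en-US"):
    """Return the request body as a JSON string.

    Field names mirror the config in this answer; adjust them if the
    API version you call expects different ones.
    """
    body = {
        "config": {
            "encoding": "LINEAR16",
            "sampleRate": sample_rate,
            "languageCode": language_code,
        },
        "audio": {"uri": gcs_uri},
    }
    return json.dumps(body, indent=2)


if __name__ == "__main__":
    # Placeholder bucket name, as in the JSON above.
    print(build_request("gs://your-bucket-name/output.raw"))
```

Redirect the output to a file and pass that file to curl with -d @filename.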

For files longer than 1 minute you need to use the asyncrecognize method:

curl -s -k -H "Content-Type: application/json" \
-H "Authorization: Bearer [YOUR-KEY]" \
https://speech.googleapis.com/v1beta1/speech:asyncrecognize \
-d @sync-request.json

It returns an operation ID. You can check whether the result is ready by fetching the operation:

curl -s -k -H "Content-Type: application/json" \
-H "Authorization: Bearer [YOUR-KEY]" \
https://speech.googleapis.com/v1beta1/operations/[OPERATION-ID]
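When checking the operation it helps to tell "not finished yet" apart from "finished, but empty". Here is a rough sketch that classifies the parsed JSON response. The helper name `describe_operation` is hypothetical, and the field names (`done`, `error`, `response.results[].alternatives[].transcript`) are my assumption about the shape of the response, so compare them with what the API actually returns to you.

```python
def describe_operation(op):
    """Classify an operation response that was already parsed from JSON.

    The keys used here are assumptions about the response shape,
    not something confirmed in this thread.
    """
    if not op.get("done"):
        return "pending"
    if "error" in op:
        return "error: " + str(op["error"].get("message", "unknown"))
    results = op.get("response", {}).get("results", [])
    # Take the first (top) alternative of each result, if there is one.
    transcripts = [
        alt["transcript"]
        for result in results
        for alt in result.get("alternatives", [])[:1]
    ]
    if not transcripts:
        return "done, but empty result"
    return " ".join(transcripts)
```

If you get "done, but empty result", the request itself went through, so the audio encoding and sample rate are the first things to check.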

Upvotes: 1

Nat Taylor

Reputation: 1108

I was also receiving empty responses but eventually got results by encoding with different settings.

sox async.wav -t raw --channels=1 --bits=16 --rate=16000 --encoding=signed-integer --endian=little async.raw
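Before converting, it can be worth checking what the source WAV file actually contains, because a mismatch between the file and the settings in the request is an easy mistake to make. A minimal sketch using Python's standard `wave` module follows. The function name `check_wav` is hypothetical, and the defaults just mirror the sox flags above (mono, 16-bit, 16000 Hz).

```python
import wave


def check_wav(path, rate=16000, channels=1, sample_width_bytes=2):
    """Return a list of mismatches between the WAV header and the expected format."""
    problems = []
    with wave.open(path, "rb") as w:
        if w.getframerate() != rate:
            problems.append("sample rate is %d, expected %d" % (w.getframerate(), rate))
        if w.getnchannels() != channels:
            problems.append("channels is %d, expected %d" % (w.getnchannels(), channels))
        if w.getsampwidth() != sample_width_bytes:
            problems.append(
                "sample width is %d bytes, expected %d" % (w.getsampwidth(), sample_width_bytes)
            )
    return problems
```

An empty list means the header already matches. Anything else tells you which sox option is doing the real work for your file.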

Upvotes: 1

Alex

Reputation: 315

This type of question is more appropriate for the Public Issue Tracker, as reproducing your exact errors would require further details. Make sure to fill in this form with the required information, or at least with a minimal working example of your code that clearly highlights the problem. For an accurate reproduction, it is important to provide the sample code or commands that you executed and that returned the error, along with the configuration files and the URIs (or files) of the audio you streamed that returned empty results.

As a matter of fact, there are known issues with the Speech API, which is currently in Beta, and these may prevent transcription from working correctly. In the meantime, you may refer to the following documentation to determine whether any of the best practices apply to your case.

Upvotes: 3
