Harry Stuart

Reputation: 1945

Google Speech streaming recognition slow response time

What is the fastest expected response time of the Google Speech API with streaming audio data? I am sending an audio stream to the API and receiving the interim results with a 2000 ms delay, which I was hoping to drop below 1000 ms. I have tested different sampling rates and different voice models.

Upvotes: 2

Views: 5461

Answers (3)

Maksim Shamihulau

Reputation: 1748

Google Cloud Speech itself is pretty fast; you can check how quickly your microphone input gets transcribed at https://cloud.google.com/speech-to-text/.

You are probably experiencing a buffering issue on your side: the tool you are using may buffer data before sending it (flushing the buffer) to the underlying device (stream).

Find out how to decrease that tool's output buffer to a lower value, e.g. 2 KB, so that data reaches the Node app and the Google service faster. Google recommends sending data in chunks equal to about 100 ms of audio.
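To get a feel for how much delay a given buffer size adds, here is a rough back-of-the-envelope sketch. It assumes uncompressed 16-bit mono PCM (LINEAR16) at 16 kHz, which the question does not state; the function names are made up for illustration and are not part of any Google library.

```python
def buffer_latency_ms(buffer_bytes, sample_rate_hz=16000,
                      bytes_per_sample=2, channels=1):
    """Milliseconds of audio held by a buffer of the given size.

    Assumes raw PCM; the defaults (16 kHz, 16-bit, mono) are an assumption.
    """
    bytes_per_second = sample_rate_hz * bytes_per_sample * channels
    return 1000.0 * buffer_bytes / bytes_per_second


def bytes_for_duration(duration_ms, sample_rate_hz=16000,
                       bytes_per_sample=2, channels=1):
    """Number of bytes of raw PCM that cover duration_ms of audio."""
    return sample_rate_hz * bytes_per_sample * channels * duration_ms // 1000


if __name__ == "__main__":
    print(buffer_latency_ms(2048))    # a 2 KB buffer
    print(buffer_latency_ms(65536))   # a 64 KB buffer
    print(bytes_for_duration(100))    # bytes needed for a 100 ms chunk
```

Under these assumptions a 2 KB buffer holds 64 ms of audio and a 100 ms chunk is 3200 bytes, while a 64 KB buffer would hold roughly two seconds. Whether a buffer of that size is actually in play depends on the tool you use.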

Upvotes: 0

rsantiago

Reputation: 2099

I'm afraid that response time can't be measured or guaranteed because of the nature of the service. We don't know what is done under the hood; in fact, there is no SLA for response time, even though there is an SLA for availability.

Something that can help is building a good request:

  1. Reducing the frame size, for example to 100 milliseconds, could give a good tradeoff between latency and efficiency.
  2. Following the Best Practices will help you make a clean request so that latency can be reduced.
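As an illustration of the first point, a minimal sketch of cutting raw audio into 100 ms frames before handing them to whatever streaming client you use. It assumes 16-bit mono PCM at 16 kHz; `iter_frames` is a hypothetical helper, not an API of the Speech client library.

```python
from typing import Iterator


def iter_frames(pcm: bytes, frame_ms: int = 100, sample_rate_hz: int = 16000,
                bytes_per_sample: int = 2, channels: int = 1) -> Iterator[bytes]:
    """Yield consecutive frames of raw PCM, each covering frame_ms of audio.

    The last frame may be shorter if the input is not a multiple of the
    frame size. Format defaults are assumptions, adjust to your stream.
    """
    frame_bytes = sample_rate_hz * bytes_per_sample * channels * frame_ms // 1000
    for start in range(0, len(pcm), frame_bytes):
        yield pcm[start:start + frame_bytes]


if __name__ == "__main__":
    one_second_of_silence = bytes(32000)  # 16000 samples * 2 bytes
    frames = list(iter_frames(one_second_of_silence))
    print(len(frames), len(frames[0]))
```

Each yielded frame would then be sent as one streaming request; smaller frames reach the service sooner but cost more requests per second of audio.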

You may want to check the following links on specific use cases to see how they addressed latency issues:

Upvotes: 1

Nikolay Shmyrev

Reputation: 25220

If you really care about response time, you'd be better off using a Kaldi-based service on your own infrastructure, something like https://github.com/alumae/kaldi-gstreamer-server together with https://github.com/Kaljurand/dictate.js

Upvotes: 0
