Word error rates for Google Cloud Speech API vs Web Speech API

Question

I'm currently using the W3C Web Speech API for Spanish and Mandarin. Overall the recognition is okay, but there are many errors (especially with single words), and the transcription sometimes adds spurious accents to Spanish words, e.g., lo siento ==> lo síento.

I'm thinking of switching to a more robust and accurate API and found the Google Speech API. While the Web Speech API is free, I'd be willing to pay for better accuracy (lower error rates). I do not need to transcribe long audio files (sentences are usually 6-8 words at most, and most often 1-4 words), and I intend to make these calls from the browser.

I cannot find documentation comparing the accuracy of these two APIs, so any help in deciding whether to switch would be appreciated.

Nikolay Shmyrev · Accepted Answer

The Google Speech API is not perfect either; you will get the best accuracy from a specialized solution.
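Since published error rates are hard to come by, one option is to measure them on your own recordings. The following is a minimal sketch, not taken from either API: it computes word error rate (word-level edit distance divided by the number of reference words) and, for Mandarin, where text is not separated by spaces, a character error rate. The function names are illustrative.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))  # distance from empty reference prefix
    for i, r in enumerate(ref, 1):
        curr = [i]  # deleting the first i reference tokens
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER for whitespace-separated languages such as Spanish."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def char_error_rate(reference: str, hypothesis: str) -> float:
    """Character-level rate, a common choice for Mandarin."""
    ref_chars = list(reference.replace(" ", ""))
    if not ref_chars:
        raise ValueError("reference must contain at least one character")
    hyp_chars = list(hypothesis.replace(" ", ""))
    return edit_distance(ref_chars, hyp_chars) / len(ref_chars)


# The accent error from the question counts as one substituted word out of two.
print(word_error_rate("lo siento", "lo síento"))
```

Running the same set of reference phrases through both services and averaging these scores gives a comparison that reflects your own speakers and vocabulary rather than a general benchmark.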

Calling the Google Speech API directly from the browser is not really an option, because you would have to expose your API key in the browser, which is a bad idea. You will have to maintain server infrastructure anyway.
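A rough sketch of the server-side piece, assuming the v1 REST `speech:recognize` endpoint with an API key passed as a query parameter and 16 kHz LINEAR16 audio; check the endpoint, config fields, and audio format against the current Google documentation. The environment variable name `SPEECH_API_KEY` and the function names are hypothetical. The browser would upload the recorded audio to your own server, which runs something like this so the key never reaches the client.

```python
import base64
import json
import os
import urllib.request

# Assumed endpoint for the v1 REST API; verify against the current docs.
RECOGNIZE_URL = "https://speech.googleapis.com/v1/speech:recognize"


def build_recognize_request(audio: bytes, language_code: str,
                            api_key: str) -> urllib.request.Request:
    """Build (but do not send) a recognize request for a short audio clip."""
    body = {
        "config": {
            "encoding": "LINEAR16",       # assumption: raw 16-bit PCM
            "sampleRateHertz": 16000,     # assumption: 16 kHz recording
            "languageCode": language_code,
        },
        # The REST API expects the audio bytes base64-encoded.
        "audio": {"content": base64.b64encode(audio).decode("ascii")},
    }
    return urllib.request.Request(
        f"{RECOGNIZE_URL}?key={api_key}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def transcribe(audio: bytes, language_code: str) -> dict:
    """Send the clip from the server; the key stays in the server environment."""
    api_key = os.environ["SPEECH_API_KEY"]  # hypothetical variable name
    request = build_recognize_request(audio, language_code, api_key)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

For example, `transcribe(clip_bytes, "es-ES")` would be called from whatever HTTP handler receives the upload, and only the transcript is returned to the browser.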
