Improving speech-to-text recognition

I have just started studying machine learning and related technologies, and I chose speech recognition as a starting point. I tried Google Cloud Speech-to-Text on both the Google sample and my own sample. As it turns out, it did not correctly recognize all the words in my sample.

How can I improve the recognition?
Is there any way to train it on my own voice or on particular phrases?
Are there any other options besides Google Cloud?

Upvotes: 1

Answers (1)

Roberto Armenta Ahumada

Reputation: 418

Google Cloud Speech-to-Text (STT) is powered by pre-trained machine learning models; the service itself is continually being improved.

To make sure you are getting the most out of STT, please review the Best Practices published in the public documentation. These cover, among others:

Sampling rate
Transmission codec
Background noise
Input channel usage
Frame size

Without your sample file it is hard to pinpoint what you need to change to improve the quality of the results. Note, however, that the Google tutorials are already designed with the best practices above in mind.
As a quick example, the how-to guide on performing synchronous speech recognition on a local file applies two of them:

The audio is encoded with the LINEAR16 codec
The sampling rate is 16000 Hz

Please review this document on how to optimize audio files to learn more.
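
If your own sample is a WAV file, a quick local check can tell you whether it matches the two settings above before you send it anywhere. The following is only a minimal sketch using Python's standard `wave` module; the function names `check_wav` and `write_silent_wav` are made up for this illustration, and treating mono as the expected channel layout is an assumption on my part, not something stated in the guide.

```python
import wave

# Targets taken from the answer above: 16-bit linear PCM (LINEAR16) at 16000 Hz.
# Expecting a single (mono) channel is an assumption of this sketch.
TARGET_RATE_HZ = 16000
TARGET_SAMPLE_WIDTH_BYTES = 2  # 16-bit samples
TARGET_CHANNELS = 1


def check_wav(path):
    """Return a list of human-readable mismatches for the PCM WAV file at `path`."""
    problems = []
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        width = wav.getsampwidth()
        channels = wav.getnchannels()
    if rate != TARGET_RATE_HZ:
        problems.append(f"sample rate is {rate} Hz, expected {TARGET_RATE_HZ} Hz")
    if width != TARGET_SAMPLE_WIDTH_BYTES:
        problems.append(f"sample width is {width * 8} bits, expected 16 bits")
    if channels != TARGET_CHANNELS:
        problems.append(f"file has {channels} channels, expected {TARGET_CHANNELS}")
    return problems


def write_silent_wav(path, rate_hz, channels, seconds=1):
    """Write a silent 16-bit PCM WAV file, only so the check can be tried out."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit
        wav.setframerate(rate_hz)
        wav.writeframes(b"\x00\x00" * rate_hz * channels * seconds)


if __name__ == "__main__":
    import os
    import tempfile

    # Demonstration on a generated file; replace the path with your own sample.
    demo_path = os.path.join(tempfile.mkdtemp(), "demo.wav")
    write_silent_wav(demo_path, rate_hz=8000, channels=2)
    for problem in check_wav(demo_path):
        print(problem)
```

An empty list means the file already matches those settings; in that case the remaining best practices (background noise, codec used in transmission) are the more likely place to look.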

Moving on, there are ways to adapt the models to your specific needs. Please review this document on how to improve transcription results and, given your question, this section on how to improve recognition of specific words and phrases. You might also want to dive into classes, as these are really helpful when you are implementing for a specific business case.
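
To give an idea of what phrase hints look like in practice, here is a rough sketch that only builds the JSON body for a recognition request, without sending it. The field names (`speechContexts`, `phrases`, `sampleRateHertz`, and so on) reflect my understanding of the v1 REST request format and should be checked against the current documentation; `build_recognize_request` is a hypothetical helper, the phrases are invented, and `$MONTH` is assumed to be one of the supported class tokens.

```python
import base64
import json


def build_recognize_request(audio_bytes, phrases, language_code="en-US"):
    """Build a request body dict with phrase hints (hypothetical helper).

    The structure is an assumption based on the v1 REST API; verify the
    field names against the official reference before relying on it.
    """
    return {
        "config": {
            "encoding": "LINEAR16",
            "sampleRateHertz": 16000,
            "languageCode": language_code,
            # Phrase hints: words or phrases the recognizer should favor.
            "speechContexts": [{"phrases": list(phrases)}],
        },
        # Inline audio content is sent base64-encoded in the JSON body.
        "audio": {"content": base64.b64encode(audio_bytes).decode("ascii")},
    }


if __name__ == "__main__":
    # Two bytes of placeholder audio; use the bytes of your own sample instead.
    body = build_recognize_request(
        b"\x00\x00",
        phrases=["machine learning", "my appointment is in $MONTH"],
    )
    print(json.dumps(body, indent=2))
```

The hints do not retrain the model; they only bias recognition toward the listed phrases for that request, which is usually enough for domain-specific terms or names.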

There are plenty of other options for Speech-to-Text and other ML/AI technologies, and it is hard to rank one over another, but please review this blog post, in which the topic is explored.

Upvotes: 2
