Mostafa Fathollahi

Reputation: 41

How to increase Google's Speech Recognition accuracy for separated numbers

We give this image to our users:

enter image description here

The picture shows separate numbers, and all of our users read it into their microphones as "11-0-9-5".

We use Google's speech engine, and it returns this result:

"1109 5".

This makes it impossible for us to compare the spoken words with the expected result, and we're stuck at this point.

Is there a way to tell Google's Speech Recognition to interpret spoken numbers literally and separately, without joining them together?

Upvotes: 2

Views: 960

Answers (1)

Dmitry

Reputation: 2052

You can try using a speech context to bias Google's speech engine toward a predefined set of numbers. The phrases act as hints to the recognizer rather than a strict grammar: https://cloud.google.com/speech-to-text/docs/reference/rest/v1/RecognitionConfig#SpeechContext

So if you specify 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 as the possible phrases, Google should be much less likely to send back 1109, as it is not in the context.
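As a rough sketch of what that could look like, here is a request body for the v1 `speech:recognize` REST method with the phrases placed in `speechContexts`. The helper name `build_recognize_request`, the `LINEAR16` encoding, the 16000 Hz sample rate, and the `en-US` language code are assumptions for illustration; adjust them to your audio. The code only builds the JSON and does not call the API.

```python
import json

# Phrases from the list above: every number the user is expected to say.
PHRASES = [str(n) for n in range(12)]  # "0" through "11"


def build_recognize_request(audio_base64, language_code="en-US"):
    """Build a JSON-serializable body for a v1 speech:recognize call.

    Hypothetical helper: encoding and sample rate below are assumed
    values, not something the question specifies.
    """
    return {
        "config": {
            "encoding": "LINEAR16",      # assumption: raw 16-bit PCM audio
            "sampleRateHertz": 16000,    # assumption: 16 kHz recording
            "languageCode": language_code,
            # Speech context: hints that bias recognition toward these phrases.
            "speechContexts": [{"phrases": PHRASES}],
        },
        "audio": {"content": audio_base64},
    }


if __name__ == "__main__":
    # Placeholder audio string; a real request needs base64-encoded audio.
    print(json.dumps(build_recognize_request("<base64 audio>"), indent=2))
```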

However, with this method you have to list all possible values, which can be tedious. Some cases won't be solved, for example if someone pronounces 11 as 1-1.
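If the grouping of the digits does not matter for your check, a possible fallback (a different technique from the speech context, and only a sketch) is to strip everything except digits from both the transcript and the expected value before comparing. The function names here are made up for illustration. Note that this treats "1-1" and "11" as the same, and it does not handle numbers returned as words such as "eleven".

```python
import re


def digits_only(text):
    """Drop every non-digit character, e.g. spaces and hyphens."""
    return re.sub(r"\D", "", text)


def matches_expected(transcript, expected):
    """Compare transcript and expected value by their digit sequence only."""
    return digits_only(transcript) == digits_only(expected)


if __name__ == "__main__":
    print(matches_expected("1109 5", "11-0-9-5"))     # True
    print(matches_expected("1 1 0 9 5", "11-0-9-5"))  # True
    print(matches_expected("1109", "11-0-9-5"))       # False
```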

Upvotes: 1
