Speech recognition with precise timestamps?

Question

Hi community,

I've been working with Google's Speech-to-Text API.

When I transcribe a WAV audio file (extracted from a video), the timestamps for some words are not very precise. According to Google the resolution is 0.1 s, but in my case the timestamps are sometimes less accurate than that, or delayed.

I tried slowing down the audio file as a workaround, but the result is more or less the same.
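
To make the slow-down workaround concrete: a minimal sketch of how word timestamps measured on the slowed-down file could be mapped back to the original timeline. It assumes the audio was slowed by one constant factor. The names `WordTiming`, `rescale_timings` and `speed_factor` are made up for illustration and are not part of any Google API.

```python
from dataclasses import dataclass


@dataclass
class WordTiming:
    word: str
    start: float  # seconds, measured on the slowed-down audio
    end: float    # seconds, measured on the slowed-down audio


def rescale_timings(timings, speed_factor):
    """Map timings from slowed-down audio back to the original timeline.

    speed_factor is the playback speed relative to the original
    (hypothetical parameter; 0.5 means the file was slowed to half speed).
    """
    if speed_factor <= 0:
        raise ValueError("speed_factor must be positive")
    # A word at t seconds in the slowed file sits at t * speed_factor
    # seconds in the original file.
    return [
        WordTiming(t.word, t.start * speed_factor, t.end * speed_factor)
        for t in timings
    ]


if __name__ == "__main__":
    slowed = [WordTiming("hello", 2.0, 3.0), WordTiming("world", 3.2, 4.0)]
    for t in rescale_timings(slowed, 0.5):
        print(t.word, t.start, t.end)
```

In theory a 0.1 s step on a half-speed file corresponds to 0.05 s on the original timeline, but as described above the timestamps did not become noticeably more precise in practice.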

Does anybody know of more precise APIs for speech recognition, or have any hints on how to better prepare the audio files?

I would like to get each word individually, together with its exact timestamps.

Thanks a lot!
