Reputation: 4973
I have a text in japanese that I'm turning into an mp3 with the Google Cloud Text to Speech functionality.
I also want to have word timestamps for the mp3 that gets returned by Google.
Google Speech to Text offers this functionality but when I submit the files I get from TTS to STT, the result is not always good.
What is the best way to also get word timestamps for the TTS mp3?
Upvotes: 1
Views: 569
Reputation: 679
Google Cloud Speech-to-Text it's a ML based service, so it's expected that the results are not always as "good" as you may expect them, it has it's limitations.
What I could suggest is to take a look at their relevant documentation about this topic like the best practices, the guide and the basics page that talk about it. Additionally, you could take a look at the issues within their issue tracker platform, like for example this issue for additional information on it and even if you find a reproducible issue within the service you can publish it there, so their team can be aware of it.
Upvotes: 0