Kelvin Tan

Reputation: 992

Find timestamp of a word in an audio

I have an audio file of human speech, about 1 minute long. I want to find the timestamp at which a given word or phrase is spoken in the audio.

Is there any existing library that can do the task?

Upvotes: 1

Views: 955

Answers (1)

Colin Beckingham

Reputation: 525

There are at least two ways to approach this issue: speech recognition and machine learning. Which is more suitable depends on your circumstances.

With speech recognition, you could run the audio through an established speech-to-text recognizer and estimate the timestamp of the word from its position in the resulting string, that is, how far it is from the beginning relative to the total length. With machine learning, you would build a model of the audio produced by the word or phrase from training data, then slice the test audio into suitable lengths and run each slice against the model to assess how likely it is to be the word you are looking for.
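As a rough illustration of the speech recognition route, here is a minimal sketch that turns a transcript position into a time estimate. It assumes you already have the transcript as a plain string from whatever recognizer you use, and that the speaker talks at a roughly constant rate, which is only an approximation. The function name `estimate_timestamp` and its parameters are made up for this example.

```python
from typing import Optional


def estimate_timestamp(transcript: str, phrase: str, audio_seconds: float) -> Optional[float]:
    """Estimate when `phrase` starts, assuming a constant speaking rate.

    The estimate is the phrase's character offset in the transcript,
    scaled by the total audio length. Returns None if the phrase is absent.
    """
    if not transcript:
        return None
    # Case-insensitive search for the first occurrence of the phrase.
    index = transcript.lower().find(phrase.lower())
    if index < 0:
        return None
    # Fraction of the way through the text, mapped onto the audio duration.
    return audio_seconds * index / len(transcript)


if __name__ == "__main__":
    text = "the quick brown fox jumps over the lazy dog"
    print(estimate_timestamp(text, "fox", 60.0))
```

Pauses and uneven speaking speed will throw this off, so treat the result as a starting point for a closer search rather than an exact answer.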

The machine learning approach is likely to give the more accurate timestamp, but of course it requires a lot of training data to build the model in the first place.
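The slicing step of the machine learning approach could look something like the sketch below. It slides a fixed-length window over the samples and keeps the start time of the best-scoring slice. The `score` callable stands in for your trained model and is purely hypothetical here, as are the function name `best_window` and the window and hop sizes.

```python
from typing import Callable, Optional, Sequence, Tuple


def best_window(
    samples: Sequence[float],
    sample_rate: int,
    window_s: float,
    hop_s: float,
    score: Callable[[Sequence[float]], float],
) -> Tuple[Optional[float], float]:
    """Return (start time in seconds, score) of the best-scoring slice.

    `score` is a placeholder for a trained model: it takes one slice of
    samples and returns a higher number when the slice looks more like
    the target word. Returns (None, -inf) if the audio is shorter than
    one window.
    """
    window = int(window_s * sample_rate)
    hop = max(1, int(hop_s * sample_rate))  # always advance by at least one sample
    best_time: Optional[float] = None
    best_score = float("-inf")
    if window <= 0:
        return best_time, best_score
    for start in range(0, len(samples) - window + 1, hop):
        current = score(samples[start:start + window])
        if current > best_score:
            best_time, best_score = start / sample_rate, current
    return best_time, best_score


if __name__ == "__main__":
    # Toy signal: silence, a burst of ones, silence. `sum` plays the model.
    signal = [0] * 10 + [1] * 5 + [0] * 10
    print(best_window(signal, sample_rate=1, window_s=5, hop_s=1, score=sum))
```

A smaller hop gives finer timestamp resolution at the cost of more model evaluations, and the window should be about as long as the word or phrase you are looking for.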

Upvotes: 2
