Reputation: 5
I am using the SpeechRecognition module in Python to convert speech to text, and spaCy to extract certain words from that text. Can I get the audio sample, or the time span, during which a particular word was spoken?
For example, I have an audio file and the word 'orange' appears in the transcript. I want to find when this specific word was spoken in the audio file, e.g. the word 'orange' was spoken from 3:10 to 3:12.
Thank you for your time.
Upvotes: 0
Views: 127
Reputation: 1
I think word timing is something that has to be taken into account when the model is trained. I don't know of any open-source model that does this, even though it looks very simple; if anyone knows of one, please share it. You can still use the Google or Azure API for that.
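To make that concrete: as far as I know, Google Cloud Speech-to-Text can return per-word start and end times when `enable_word_time_offsets` is set in the recognition config. Once you have timings like that, finding a word is a simple lookup. Below is a minimal sketch that assumes the timings have already been converted to a list of dicts with `word`, `start` and `end` keys (in seconds). That shape, the sample data, and the helper names `find_word_spans` and `format_time` are my own assumptions for illustration, not any API's actual response format.

```python
from typing import Dict, List, Tuple


def find_word_spans(words: List[Dict], target: str) -> List[Tuple[float, float]]:
    """Return (start, end) times in seconds for every occurrence of target."""
    target = target.lower()
    spans = []
    for w in words:
        # Strip surrounding punctuation so "orange," still matches "orange".
        token = w["word"].strip(".,!?;:\"'").lower()
        if token == target:
            spans.append((w["start"], w["end"]))
    return spans


def format_time(seconds: float) -> str:
    """Format a time in seconds as m:ss (fractions of a second are dropped)."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes}:{secs:02d}"


# Hypothetical word timings, made up for illustration only.
words = [
    {"word": "I", "start": 189.2, "end": 189.4},
    {"word": "ate", "start": 189.4, "end": 189.9},
    {"word": "an", "start": 189.9, "end": 190.0},
    {"word": "orange", "start": 190.0, "end": 192.0},
]

spans = find_word_spans(words, "orange")
for start, end in spans:
    print(f"orange: {format_time(start)} to {format_time(end)}")
# prints "orange: 3:10 to 3:12"
```

The same start and end values can then be used to cut that piece out of the audio file if you need the sample itself rather than just the times.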
Upvotes: 0