Subha Nawer Pushpita

Reputation: 109

Detecting whether an audio file has speech in Python

I don't know much about audio detection. I only started learning about it today and came across webrtcvad (which felt poorly documented :'( ) and LibROSA. My task is this: given an audio file, which may be empty or may contain noise but no speech, I have to detect whether it contains any speech. Any idea how I can get started? Any help would be appreciated. Thanks in advance.

Upvotes: 2

Views: 6237

Answers (3)

amiasato

Reputation: 1020

The speechmetrics package provides two absolute speech quality measures, MOSNet and SRMR. You can pass your audio excerpt to these measures, compare the scores returned for silence, noise, and speech, and set thresholds accordingly.

Upvotes: 0

Ronie Martinez

Reputation: 1275

This is a fairly broad question, but there are several possible solutions:

  1. Pass the audio to a speech-to-text recognizer. If you get text back, there is speech.
  2. For a more audio-analysis-oriented method, use a frequency filter that only checks the range of the human voice.
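A rough sketch of option 2, using only the standard library. It assumes the audio has already been decoded to mono float samples in the range -1.0 to 1.0. The 300-3400 Hz band, the thresholds, and the function names are all illustrative assumptions and would need tuning on real recordings. Note that this only checks where the energy sits in the spectrum, so music or band-limited noise can also pass; it is a heuristic, not true speech detection.

```python
import cmath
import math

# Assumed "voice band" (telephone-style); not a universal constant.
VOICE_BAND_HZ = (300.0, 3400.0)


def rms(frame):
    """Root-mean-square level of one frame of float samples."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def band_energy_ratio(frame, sample_rate, band=VOICE_BAND_HZ):
    """Fraction of the frame's spectral energy (DC excluded) inside `band`."""
    n = len(frame)
    if n == 0:
        return 0.0
    low_hz, high_hz = band
    total = 0.0
    in_band = 0.0
    # Naive DFT over the positive-frequency bins; slow, but fine for short frames.
    for k in range(1, n // 2 + 1):
        coeff = sum(s * cmath.exp(-2j * math.pi * k * i / n)
                    for i, s in enumerate(frame))
        power = abs(coeff) ** 2
        total += power
        if low_hz <= k * sample_rate / n <= high_hz:
            in_band += power
    return in_band / total if total > 0 else 0.0


def looks_like_speech(samples, sample_rate, frame_ms=20,
                      rms_threshold=0.01, ratio_threshold=0.6,
                      min_active_fraction=0.1):
    """Heuristic: enough frames are loud AND dominated by voice-band energy."""
    frame_len = int(sample_rate * frame_ms / 1000)
    if frame_len <= 0:
        return False
    # Split into non-overlapping frames, dropping any incomplete tail.
    frames = [samples[i:i + frame_len]
              for i in range(0, len(samples) - frame_len + 1, frame_len)]
    if not frames:
        return False
    active = sum(
        1 for f in frames
        if rms(f) >= rms_threshold
        and band_energy_ratio(f, sample_rate) >= ratio_threshold
    )
    return active / len(frames) >= min_active_fraction
```

With real files you would normally swap the naive DFT for an FFT from a numerical library, since the loop above gets slow on long recordings.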

EDIT: Here are some libraries for processing audio:

  1. librosa (https://github.com/librosa/librosa) - has lots of features, but the documentation makes it hard for beginners
  2. pydub (https://github.com/jiaaro/pydub) - easier to use than librosa, but has fewer features and represents audio differently from librosa (so it is not easy to integrate with librosa)
  3. spleeter (https://github.com/deezer/spleeter) - separates vocals and other instruments

Upvotes: 1

dvnt

Reputation: 193

Could this be done through frequency analysis? If so, depending on whether you're using an MP3 or a WAV file, these are the two options I'm aware of:

Upvotes: 0
