user1019710

Reputation: 321

How to detect the voice from an audio stream

I need to determine when someone is speaking in an audio stream. I have applied a Hamming window and calculated the FFT. How do I detect the human voice from here?

Upvotes: 2

Views: 3224

Answers (2)

hotpaw2

Reputation: 70673

If you want to experiment with your own voice activity detection algorithms, an FFT can be used as an initial stage. Next you might want to try subtracting any characterized stationary spectral noise background. Then you could try using the modified FFT results to calculate a cepstrum (or some weighted cepstral coefficients) for feature extraction. You could then do some statistical pattern matching on whatever feature vectors you decided to extract, and feed the results to a decision algorithm.

Each of the above steps has likely been a research topic, and a good implementation might involve studying dozens of published research papers, which perhaps can be found in your university library.
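As a rough illustration of the pipeline described above, here is a minimal sketch in Python with NumPy. It is not a production detector. The function names, the frame sizes, the assumption that the first `noise_frames` frames contain only background noise, the spectral floor of 10% of the noise estimate, and the z-score distance used as a stand-in for "statistical pattern matching" are all illustrative choices of mine, not something prescribed by the steps above.

```python
import numpy as np

def frame_signal(x, frame_len=512, hop=256):
    """Split a 1-D signal into overlapping Hamming-windowed frames."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx] * np.hamming(frame_len)

def cepstral_features(x, noise_frames=20, n_coeffs=13):
    """Return the first n_coeffs real-cepstrum coefficients of every frame."""
    frames = frame_signal(x)
    # Stage 1: FFT magnitude of each windowed frame.
    mag = np.abs(np.fft.rfft(frames, axis=1))
    # Stage 2: subtract a stationary noise spectrum.
    # ASSUMPTION: the first noise_frames frames contain no speech.
    noise = mag[:noise_frames].mean(axis=0)
    clean = np.maximum(mag - noise, 0.1 * noise)  # floor keeps the log finite
    # Stage 3: real cepstrum = inverse FFT of the log magnitude spectrum.
    cep = np.fft.irfft(np.log(clean), axis=1)
    return cep[:, :n_coeffs]

def detect_voice(x, noise_frames=20, k=3.0):
    """Flag frames whose features lie far from the noise-only reference."""
    feats = cepstral_features(x, noise_frames)
    ref = feats[:noise_frames]
    mu = ref.mean(axis=0)
    sigma = ref.std(axis=0) + 1e-8
    # Stage 4: a very crude decision rule, the RMS z-score per frame.
    dist = np.sqrt((((feats - mu) / sigma) ** 2).mean(axis=1))
    return dist > k
```

`detect_voice(x)` returns one boolean per frame. A real system would likely replace the last stage with a trained classifier and track the noise estimate over time instead of fixing it from the first frames.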

Upvotes: 2

Paul R

Reputation: 212949

You don't need to do an FFT for this. What you need to implement is a Voice Activity Detection (VAD) algorithm.
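To give an idea of what an FFT-free approach can look like, here is a minimal sketch of one of the simplest VAD schemes: short-time energy in the time domain, compared against an estimated noise floor. This is only an assumed example of such an algorithm. The function name, the frame length, the 10 dB margin, the use of the 10th percentile as the noise floor, and the hangover length are all hypothetical choices that would need tuning on real audio.

```python
import numpy as np

def energy_vad(x, frame_len=256, margin_db=10.0, hangover=3):
    """Return one boolean per non-overlapping frame: True where energy is high."""
    n_frames = len(x) // frame_len
    frames = np.reshape(x[:n_frames * frame_len], (n_frames, frame_len))
    # Short-time energy of each frame in dB (small offset avoids log of zero).
    energy_db = 10.0 * np.log10(np.mean(frames ** 2, axis=1) + 1e-12)
    # ASSUMPTION: at least about 10% of the frames are background noise only.
    noise_db = np.percentile(energy_db, 10)
    raw = energy_db > noise_db + margin_db
    # Hangover: keep the flag on for a few frames after the energy drops,
    # so short pauses inside a word are not cut out.
    flags = raw.copy()
    count = 0
    for i, active in enumerate(raw):
        if active:
            count = hangover
        elif count > 0:
            flags[i] = True
            count -= 1
    return flags
```

Note that an energy threshold only detects "something louder than the background", not speech specifically, so practical detectors usually combine it with further features such as the zero-crossing rate or spectral measures.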

Upvotes: 1
