Selim Turkoglu

Reputation: 259

Audio Classification based on FFT

In Python, I'm performing alarm recognition by checking only frequencies and amplitudes. My code takes the FFT of a 1 s sound, then compares it with predetermined frequencies and their amplitudes. Since the alarms contain high frequencies (6-9 kHz etc.) and the array is long (44100 elements), I could get this working without ML. Thanks to the high resolution of the FFT, I can distinguish amplitude changes even at close frequencies such as 7010 Hz and 7016 Hz, and since there is no external noise at these frequencies in the recording environment, I can identify the correct alarms.

However, I want to implement this with ML, since the manual approach is hard to maintain with lots of alarms. There are lots of audio classification sources and working examples, but I couldn't find one well suited to my case. They usually use feature extraction such as MFCC, but I don't want to lose my resolution by using MFCC, because it combines close frequencies. So I want to build an ML algorithm that checks only two arrays for every class: frequencies and amplitudes (both have 44100 elements).

Can you suggest any source for building this algorithm? I checked the source below, which is OK, but I don't want to use MFCC-like methods. If you comment, I can extend my question with examples. pyAudioClassification
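For illustration, the frequency/amplitude comparison described above might look roughly like the sketch below. It is not the original code: the function names, the template format (a dict mapping frequency in Hz to expected amplitude), the Hann window, and the 50% tolerance are all assumptions, and the signal is synthetic.

```python
import numpy as np

SAMPLE_RATE = 44100  # Hz; assumed, so that 1 s of audio gives 1 Hz FFT resolution


def magnitude_spectrum(signal, sample_rate=SAMPLE_RATE):
    """Return (frequencies, amplitudes) for a real-valued signal."""
    window = np.hanning(len(signal))  # assumed window choice, reduces leakage
    spectrum = np.fft.rfft(signal * window)
    # Scale so a pure sine of amplitude A gives a peak of roughly A
    amplitudes = 2.0 * np.abs(spectrum) / window.sum()
    frequencies = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    return frequencies, amplitudes


def matches_template(frequencies, amplitudes, template, tolerance=0.5):
    """Check each template frequency against its expected amplitude.

    `template` is a hypothetical format: {frequency_hz: expected_amplitude}.
    """
    for freq_hz, expected in template.items():
        idx = np.argmin(np.abs(frequencies - freq_hz))  # nearest FFT bin
        if abs(amplitudes[idx] - expected) > tolerance * expected:
            return False
    return True


# Synthetic 1 s "alarm" with two close tones at 7010 Hz and 7016 Hz
t = np.arange(SAMPLE_RATE) / SAMPLE_RATE
signal = 1.0 * np.sin(2 * np.pi * 7010 * t) + 0.5 * np.sin(2 * np.pi * 7016 * t)

freqs, amps = magnitude_spectrum(signal)
template = {7010.0: 1.0, 7016.0: 0.5}
is_alarm = matches_template(freqs, amps, template)
```

With a 1 s window the bins are 1 Hz apart, which is why tones only 6 Hz apart land in clearly separate bins.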

Upvotes: 4

Views: 658

Answers (1)

Jon Nordby

Reputation: 6259

Alarms usually have characteristic temporal signatures in addition to a particular frequency: either undulating or on/off patterns.

To detect these, you should convert the STFT into a log-scaled mel spectrogram. You can then classify it using analysis time windows of 100-1000 ms. Convolutional Neural Networks tend to do best, but you might also be OK with just a Random Forest classifier.
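As a rough sketch of the feature extraction step, a log-scaled mel spectrogram can be computed with NumPy alone. The parameter values (`n_fft=2048`, `hop=512`, `n_mels=64`), the helper names, and the synthetic on/off test tone are assumptions for illustration, not recommended settings.

```python
import numpy as np


def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filterbank(n_mels, n_fft, sample_rate, fmin=0.0, fmax=None):
    """Triangular filters spaced evenly on the mel scale."""
    fmax = fmax or sample_rate / 2
    hz_points = mel_to_hz(np.linspace(hz_to_mel(fmin), hz_to_mel(fmax), n_mels + 2))
    fft_freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    bank = np.zeros((n_mels, len(fft_freqs)))
    for i in range(n_mels):
        lower, center, upper = hz_points[i:i + 3]
        rising = (fft_freqs - lower) / (center - lower)
        falling = (upper - fft_freqs) / (upper - center)
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank


def log_mel_spectrogram(signal, sample_rate, n_fft=2048, hop=512, n_mels=64):
    """Return an array of shape (n_frames, n_mels) in dB."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[i * hop:i * hop + n_fft] * window for i in range(n_frames)]
    )
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2  # STFT power per frame
    mel_power = power @ mel_filterbank(n_mels, n_fft, sample_rate).T
    return 10.0 * np.log10(mel_power + 1e-10)  # small offset avoids log(0)


# Synthetic on/off alarm: 7 kHz tone switched every 125 ms
sample_rate = 44100
t = np.arange(sample_rate) / sample_rate
gate = (np.floor(t * 8) % 2 == 0)
signal = gate * np.sin(2 * np.pi * 7000 * t)

features = log_mel_spectrogram(signal, sample_rate)
```

Each row of `features` is one time frame, so an on/off pattern shows up as alternating loud and quiet rows. Rows covering a 100-1000 ms window could then be flattened into one feature vector per window and passed to a classifier such as scikit-learn's `RandomForestClassifier`. Note that mel bands merge neighbouring frequencies, so tones only a few Hz apart will fall into the same band.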

Upvotes: 0
