Reputation: 157
I have split audio recordings of all the English letters (A, B, C, D, etc.) into separate .wav chunks, one letter per file. I want to sort the chunks into groups by letter. For example, all the audio files of the letter A should end up in one folder, so that I have 26 folders, each containing different recordings of the same letter.
I have searched for this and found some work that uses k-means clustering, but I could not get it to do what I need.
Upvotes: 0
Views: 474
Reputation: 5064
First of all, you need to convert the sounds into a representation suitable for further processing, that is, feature vectors to which you can apply classification or clustering algorithms.
For audio, the typical choice is features based on the spectrum. For processing the sounds, librosa can be very helpful.
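As a rough illustration of spectrum-based features, here is a minimal sketch that uses only NumPy: it cuts a signal into overlapping frames and takes the log-magnitude spectrum of each frame. The function name `log_spectrogram`, the frame length, the hop size and the 16 kHz sampling rate are all assumptions made for the example. With librosa you would more likely load the file with `librosa.load` and compute MFCCs with `librosa.feature.mfcc`; this is a stand-in, not a replacement for that.

```python
import numpy as np

def log_spectrogram(signal, frame_len=400, hop=160):
    """Return a (n_frames, frame_len // 2 + 1) matrix of log-magnitude spectra.

    frame_len and hop are illustrative defaults (25 ms / 10 ms at 16 kHz).
    """
    signal = np.asarray(signal, dtype=np.float64)
    if len(signal) < frame_len:
        # Pad very short recordings so that at least one frame exists.
        signal = np.pad(signal, (0, frame_len - len(signal)))
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hanning(frame_len)
    # One row per frame, each multiplied by a Hann window.
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    # Small constant avoids log(0) for silent frames.
    return np.log(magnitude + 1e-8)

# Synthetic one-second 440 Hz tone at an assumed 16 kHz sampling rate.
sr = 16000
t = np.arange(sr) / sr
features = log_spectrogram(np.sin(2 * np.pi * 440 * t))
print(features.shape)  # (number of frames, number of frequency bins)
```

The number of rows depends on the length of the recording, which is why a further step is needed to get one fixed-size vector per file.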
Since the sounds have different durations and you probably want a fixed-size feature vector for each recording, you need a way to build a single feature vector from a series of per-frame vectors. Different methods can be used here, depending on how much data you have and whether labels are available. Assuming you have a limited number of recordings and no labels, you can start by simply stacking several vectors together. Averaging is another possibility, but it destroys the temporal information (which can be OK in this case). Training some kind of RNN and using its hidden state as the learned representation is the most powerful method.
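The two simple options, averaging and stacking, could look roughly like the sketch below. It assumes each recording is already a `(n_frames, n_features)` NumPy array (for example 13 MFCCs per frame); the function names `average_pool` and `stack_pool` and the choice of 4 segments are made up for the illustration.

```python
import numpy as np

def average_pool(frames):
    """Collapse (n_frames, n_features) to (n_features,) by averaging over time."""
    return np.asarray(frames).mean(axis=0)

def stack_pool(frames, n_segments=4):
    """Split the frames into n_segments chunks in time order, average each
    chunk, and concatenate the results into one (n_segments * n_features,)
    vector. Unlike plain averaging, this keeps a coarse temporal order."""
    frames = np.asarray(frames)
    if len(frames) < n_segments:
        # Repeat frames of very short recordings so every segment is non-empty.
        reps = -(-n_segments // len(frames))  # ceiling division
        frames = np.repeat(frames, reps, axis=0)
    segments = np.array_split(frames, n_segments, axis=0)
    return np.concatenate([seg.mean(axis=0) for seg in segments])

# Two fake recordings of different length, 13 features per frame.
rng = np.random.default_rng(0)
short_rec = rng.normal(size=(30, 13))
long_rec = rng.normal(size=(95, 13))

# Both pooling methods give the same vector size regardless of duration.
print(average_pool(short_rec).shape, average_pool(long_rec).shape)
print(stack_pool(short_rec).shape, stack_pool(long_rec).shape)
```

Either kind of vector can then be passed to a clustering algorithm such as k-means with 26 clusters, or to a classifier if you label a few examples per letter.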
Take a look at this related answer: How to classify continuous audio
Upvotes: 2