Reputation: 75127
I extracted features from some voice samples using the MFCC algorithm, and I want to cluster them with K-Means. Each voice sample has 70 frames, and every frame has 9 cepstral coefficients, so each sample is a 70*9 matrix.
Let's assume that A, B and C are the voice records so
A is:
List<List<Double>> -> 70*9 array (I can use Vector instead of List)
and B and C have the same dimensions.
I don't want to cluster individual frames; I want to cluster whole blocks of frames (in my example, one block has 70 frames).
How can I implement this with K-Means in Java?
Upvotes: 3
Views: 2552
Reputation: 77454
K-Means makes some pretty strong assumptions about your data. I'm not convinced that your data is appropriate for k-means.
Side note: keep away from Java generics with boxed primitives such as Double. It kills performance. Use double[][] instead.
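A minimal sketch of moving from the boxed list to a primitive array, assuming the outer list holds frames and each inner list holds that frame's coefficients. The class and method names (`FrameConverter`, `toArray`) are made up for illustration:

```java
import java.util.List;

final class FrameConverter {

    // Copies a frames x coefficients List<List<Double>> into a double[][].
    static double[][] toArray(List<List<Double>> frames) {
        double[][] out = new double[frames.size()][];
        for (int i = 0; i < frames.size(); i++) {
            List<Double> row = frames.get(i);
            out[i] = new double[row.size()];
            for (int j = 0; j < row.size(); j++) {
                out[i][j] = row.get(j); // unboxing happens once, here
            }
        }
        return out;
    }
}
```

After the copy, all distance computations can run on `double[][]` without any boxing.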
Upvotes: 0
Reputation: 5144
Here's where your knowledge of the problem domain becomes crucial. You might just use a distance between the 70*9 matrices, but you can probably do better. I don't know the particular features you mention, but some generic examples might be the average and standard deviation of the 70 values per feature. You're basically looking to reduce the number of dimensions, both to improve speed and to make the measure robust against simple transformations, like offsetting all values by one step.
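As a rough sketch of that idea: each 70*9 recording is collapsed into one 18-value vector (mean and standard deviation per coefficient), and those vectors are then clustered with a plain Lloyd's k-means. The names `RecordingClusterer`, `summarize` and `kMeans` are invented for this example, and starting the centroids at the first k points is a simplifying assumption, not a recommendation:

```java
import java.util.Arrays;

final class RecordingClusterer {

    // Collapses a frames x coefficients matrix into [means..., stdDevs...].
    static double[] summarize(double[][] frames) {
        int n = frames.length, d = frames[0].length;
        double[] features = new double[2 * d];
        for (int j = 0; j < d; j++) {
            double sum = 0;
            for (double[] frame : frames) sum += frame[j];
            double mean = sum / n;
            double sq = 0;
            for (double[] frame : frames) sq += (frame[j] - mean) * (frame[j] - mean);
            features[j] = mean;
            features[d + j] = Math.sqrt(sq / n); // population standard deviation
        }
        return features;
    }

    static double squaredDistance(double[] a, double[] b) {
        double s = 0;
        for (int i = 0; i < a.length; i++) s += (a[i] - b[i]) * (a[i] - b[i]);
        return s;
    }

    // Plain Lloyd's k-means; returns the cluster index of each point.
    // Assumption: centroids start at the first k points (naive initialisation).
    static int[] kMeans(double[][] points, int k, int maxIterations) {
        int dim = points[0].length;
        double[][] centroids = new double[k][];
        for (int c = 0; c < k; c++) centroids[c] = points[c].clone();
        int[] labels = new int[points.length];
        Arrays.fill(labels, -1);
        for (int iter = 0; iter < maxIterations; iter++) {
            boolean changed = false;
            // Assignment step: attach every point to its nearest centroid.
            for (int p = 0; p < points.length; p++) {
                int best = 0;
                double bestDist = squaredDistance(points[p], centroids[0]);
                for (int c = 1; c < k; c++) {
                    double dist = squaredDistance(points[p], centroids[c]);
                    if (dist < bestDist) { bestDist = dist; best = c; }
                }
                if (labels[p] != best) { labels[p] = best; changed = true; }
            }
            if (!changed) break; // converged
            // Update step: move each centroid to the mean of its points.
            double[][] sums = new double[k][dim];
            int[] counts = new int[k];
            for (int p = 0; p < points.length; p++) {
                counts[labels[p]]++;
                for (int i = 0; i < dim; i++) sums[labels[p]][i] += points[p][i];
            }
            for (int c = 0; c < k; c++) {
                if (counts[c] == 0) continue; // leave an empty cluster where it is
                for (int i = 0; i < dim; i++) centroids[c][i] = sums[c][i] / counts[c];
            }
        }
        return labels;
    }
}
```

To use it, call `summarize` once per recording, stack the resulting vectors into a `double[][]`, and pass that to `kMeans`. Because means and standard deviations can sit on different scales, normalising each column before clustering may be worth trying.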
Upvotes: 3