Reputation: 2688
For a project, I want to implement a color-clustering algorithm that replaces similar colors with the average color of their cluster.
For now, I use the k-means algorithm to cluster the whole image, but this takes a long time. Does anyone have an idea how to run k-means on a color histogram instead, so that I can speed this up?
Upvotes: 7
Views: 6203
Reputation: 77464
Downsample the image first, then run k-means.
If you resize the image to half its size in both x and y, it shouldn't affect the colors much, but k-means should take roughly a quarter of the time, because only a quarter of the pixels remain. If you resample to 1/10 of the width and height, k-means should run roughly 100 times faster.
https://en.wikipedia.org/wiki/Color_quantization
By downsampling the image, you have fewer "pixels" to process during clustering. But in the end, it should produce roughly the same color scheme.
So the real output is not an image or image regions. It's the palette.
You can then map an arbitrary image (including the full resolution version) to this color palette by simply replacing each pixel with the closest color!
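To make the "thumbnail for the palette, full image for the mapping" idea concrete, here is a minimal sketch in plain Python. The helper names (`downsample`, `nearest_color`, `apply_palette`) and the hard-coded palette are made up for illustration; in practice the palette would come from running k-means on the thumbnail, and you would use an image library rather than nested lists.

```python
def downsample(image, step):
    """Keep every `step`-th pixel in both x and y (crude nearest-neighbour subsampling)."""
    return [row[::step] for row in image[::step]]

def nearest_color(pixel, palette):
    """Return the palette entry with the smallest squared RGB distance to `pixel`."""
    return min(palette, key=lambda c: sum((a - b) ** 2 for a, b in zip(pixel, c)))

def apply_palette(image, palette):
    """Replace every pixel of `image` with its closest palette color."""
    return [[nearest_color(p, palette) for p in row] for row in image]

# Toy 4x4 image: left half reddish, right half bluish.
image = [[(250, 10, 10), (240, 0, 5), (10, 10, 250), (0, 5, 240)] for _ in range(4)]

small = downsample(image, 2)            # 2x2 thumbnail: a quarter of the pixels
palette = [(245, 5, 8), (5, 8, 245)]    # hypothetical palette, e.g. from k-means on `small`
quantized = apply_palette(image, palette)  # full-resolution image, palette colors only
```

The mapping step is a single pass over the full image (n * k distance computations), so it costs about as much as one k-means iteration.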
The complexity of k-means is O(n*k*i), where n is the number of pixels you have, k the desired number of output colors, and i the number of iterations needed until convergence.
n: by downsampling, you can easily reduce n, the largest factor. In many situations, you can reduce this quite significantly before you see a degradation in the quality of the resulting palette.
k: this is your desired number of output colors. Whether you can reduce this or not depends on your actual use case.
i: various factors can have an effect on convergence (including the other two factors!), but the strongest is probably having good starting values. So if you have a very fast but low-quality method to choose the palette, run it first, then use k-means to refine this palette. Maybe OpenCV already includes an appropriate heuristic for this, though!
As you can see, the easiest approach is to reduce n. You can reduce n significantly, produce an optimized palette for the thumbnail, then rerun k-means on the full image, refining this palette. Since this will hopefully reduce the number of iterations significantly, it can sometimes perform very well.
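The two-stage idea (cheap k-means on a reduced sample, then a warm-started pass on all pixels) could look roughly like the sketch below. It is a plain-Python version of Lloyd's algorithm written only for illustration: the function names, the synthetic pixel data, and the "every 100th pixel" stand-in for downsampling are all assumptions, not anything from OpenCV.

```python
import random

def dist2(a, b):
    """Squared Euclidean distance between two RGB tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(pixels, centers, max_iter=100):
    """Lloyd's algorithm from the given starting centers.
    Returns (centers, number of iterations used)."""
    centers = [tuple(map(float, c)) for c in centers]
    for it in range(1, max_iter + 1):
        sums = [[0.0, 0.0, 0.0] for _ in centers]
        counts = [0] * len(centers)
        # Assignment step: n * k distance evaluations per iteration.
        for p in pixels:
            j = min(range(len(centers)), key=lambda c: dist2(p, centers[c]))
            counts[j] += 1
            for d in range(3):
                sums[j][d] += p[d]
        # Update step: move each center to the mean of its assigned pixels.
        new = []
        for j, c in enumerate(centers):
            if counts[j]:
                new.append(tuple(s / counts[j] for s in sums[j]))
            else:
                new.append(c)  # keep the center of an empty cluster unchanged
        if new == centers:     # no center moved: converged
            return new, it
        centers = new
    return centers, max_iter

rng = random.Random(0)
# Synthetic "image": pixels scattered around three base colors.
bases = [(200, 30, 30), (30, 200, 30), (30, 30, 200)]
pixels = [tuple(b + rng.randint(-20, 20) for b in rng.choice(bases))
          for _ in range(3000)]

thumb = pixels[::100]                     # stand-in for a thumbnail: n / 100 samples
palette, _ = kmeans(thumb, thumb[:3])     # cheap pass on the small sample
refined, iters = kmeans(pixels, palette)  # warm-started pass on all pixels
```

Taking the first three sample pixels as starting centers is deliberately naive; a better seeding heuristic for the cheap pass should help further. The point is only that the expensive full-resolution pass starts from a palette that is already close, so `iters` should be small.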
Upvotes: 6
Reputation: 8725
My answer is not about histogram clustering, but I recently needed to speed up the clustering step of my algorithm. For this I did the following:
This sped up clustering several times over. You can also try experimenting with OpenCV's mean-shift filtering.
Upvotes: 1
Reputation: 3988
You need to assign a weight to each data point, i.e. the number of pixels in its histogram bin. Then, when you compute the new value for the cluster centroids, you use a weighted average instead of a plain average. However, the interface of OpenCV's k-means clustering does not support weighted values. You can use the C Clustering Library, which does support them, is quite well documented (although it takes its examples from bioinformatics), and is easy to integrate (a single .h/.c file).
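As a rough illustration of the weighted update, here is a small plain-Python sketch that clusters histogram bins instead of pixels. The binning scheme (5 bits per channel, each bin represented by the center of its cell), the function names, and the toy data are my own assumptions for the example; this is not the API of the C Clustering Library.

```python
from collections import Counter

def color_histogram(pixels, bits=5):
    """Count pixels per coarse RGB bin, keeping `bits` bits per channel.
    Each bin is keyed by the center color of its cell."""
    shift = 8 - bits
    half = (1 << shift) >> 1
    return Counter(tuple(((c >> shift) << shift) + half for c in p) for p in pixels)

def weighted_kmeans(hist, centers, max_iter=100):
    """Lloyd's algorithm over histogram bins, each weighted by its pixel count."""
    centers = [tuple(map(float, c)) for c in centers]
    for _ in range(max_iter):
        sums = [[0.0, 0.0, 0.0] for _ in centers]
        weights = [0] * len(centers)
        for color, count in hist.items():
            # Assign the whole bin to its nearest center.
            j = min(range(len(centers)),
                    key=lambda c: sum((a - b) ** 2 for a, b in zip(color, centers[c])))
            weights[j] += count
            for d in range(3):
                sums[j][d] += count * color[d]   # weighted sum, not a plain sum
        # Weighted mean per cluster; an empty cluster keeps its old center.
        new = [tuple(s / weights[j] for s in sums[j]) if weights[j] else centers[j]
               for j in range(len(centers))]
        if new == centers:
            break
        centers = new
    return centers

# Toy data: 900 bright red, 100 darker red and 500 blue pixels.
pixels = [(250, 10, 10)] * 900 + [(200, 10, 10)] * 100 + [(10, 10, 250)] * 500
hist = color_histogram(pixels)   # 3 bins instead of 1500 pixels
centers = weighted_kmeans(hist, [(255, 0, 0), (0, 0, 255)])
# The red center lands near (247.2, 12, 12): the 900-pixel bin dominates the
# 100-pixel bin. A plain average of the two red bins would give 228 instead.
```

Each iteration now costs (number of non-empty bins) * k instead of n * k, at the price of the color error introduced by binning.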
Upvotes: 0