Seed selection strategies for K-means

Question

I'm wondering what seed selection methods I can apply to the K-means algorithm. A Google search wasn't much help. Any suggestions?

cyborg · Accepted Answer

Good seeds depend on the domain. For example, if your data items are words, the most frequent words make reasonable seeds. Otherwise, you could cluster a small sample of the data first and use the resulting cluster centers as seeds for the full run.
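To make the sample-based idea concrete, here is a minimal sketch in plain Python. It assumes the points are numeric tuples and uses basic Lloyd iterations. The helper names (`kmeans`, `seeds_from_sample`), the sample size, and the toy data are all made up for illustration; this is one way to do it, not a tuned implementation.

```python
import random

def sq_dist(a, b):
    # Squared Euclidean distance between two points given as tuples.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    # Coordinate-wise mean of a non-empty list of points.
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def kmeans(points, seeds, iters=20):
    # Plain Lloyd iterations starting from the given seeds.
    centroids = list(seeds)
    for _ in range(iters):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: sq_dist(p, centroids[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous centroid.
        centroids = [mean(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids

def seeds_from_sample(points, k, sample_size, rng):
    # Cluster a small random sample and return its centroids as seeds.
    sample = rng.sample(points, sample_size)
    initial = rng.sample(sample, k)  # random starting points within the sample
    return kmeans(sample, initial)

rng = random.Random(0)
# Toy data (hypothetical): two blobs around (0, 0) and (10, 10).
points = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]
points += [(rng.gauss(10, 1), rng.gauss(10, 1)) for _ in range(200)]

seeds = seeds_from_sample(points, k=2, sample_size=40, rng=rng)
centroids = kmeans(points, seeds)  # full run, started from the sample's centroids
print(seeds)
print(centroids)
```

The run on the sample is cheap because it touches only a fraction of the data, and the full run then starts from centers that already roughly follow the data's structure. How large the sample needs to be depends on your data.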

Here is an example of a more sophisticated algorithm:

K. Karteeka Pavan, Allam Appa Rao, A.V. Dattatreya Rao and G.R. Sridhar. Single Pass Seed Selection Algorithm for k-Means. Journal of Computer Science 6 (1): 60-66, 2010.
