Sabhijiit
Sabhijiit

Reputation: 55

Meaning of O(logk) competitive complexity

I am working on an existing algorithm to improve its complexity. The existing algorithm uses K-means to perform clustering, whereas I chose to use K-means++ to do the same.

K-means++ was chosen because it mostly has faster and more accurate clustering results compared to K-mean.

Now, towards the end, where I have to compare the complexity of the new and existing algorithms, I find that I can't make sense of the fact that K-means++ has a complexity of O(logk) competitive.

I have tried looking everywhere on the web for an explanation, including stack overflow.

The only thing I have understood is that competitive has something to do with "on-line" and "off-line" algorithms. Could anyone please explain how it applies here?

Upvotes: 3

Views: 1443

Answers (1)

Matt Timmermans
Matt Timmermans

Reputation: 59194

The full sentence that you are reading says something like "The k-means++ clustering is O(log k)-competitive to the optimal k-means solution".

This is not a statement about its algorithmic complexity. It's a statement about its effectiveness. You can use O-notation for other things.

K-means attempts to minimize a "potential" that is calculated as the sum of the squared distances of points from their cluster centers.

For any specific clustering problem, the expected potential of a K-means++ solution is at most 8(ln k + 2) times the potential of the best possible solution. That 8(ln k + 2) is shortened to O(log k) for brevity.

The precise meaning of the statement that the k-means++ solution is O(log k)-competitive is that there is some constant C such that the expected ratio between the k-means++ potential and the best possible potential is less than C*(log k) for all sufficiently large k.

the smallest such constant is about 8

Upvotes: 4

Related Questions