Reputation: 19
I am working with k-means and k-medoids. Running k-means prints the following information:
Within cluster sum of squares by cluster:
[1] 12636160 7631152 10226254
(between_SS / total_SS = 79.2 %)
Is between_SS / total_SS a ratio that shows the overall throughput of the algorithm?
And with pam:
Objective function:
build swap
211.6604 210.5670
How do you interpret these results?
Upvotes: 0
Views: 446
Reputation: 37661
If by "throughput" and "efficiency" you mean anything about processing speed, then no. These are all measures of how successful the clustering algorithm was at finding a good grouping (or perhaps how well these points can be grouped).
k-means
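The meaning of between_SS (between-cluster sum of squares) and total_SS (total sum of squares) has been explained in this previous Cross Validated question and its answers. The ratio of between_SS to total_SS is the fraction of the total sum of squares that is accounted for by the separation between clusters, so it measures how well the points clustered: the closer it is to 100 %, the tighter the clusters are relative to the overall spread of the data.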
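As a minimal sketch of where that percentage comes from, the ratio can be recomputed by hand from a `kmeans()` fit. The data below are made up purely for illustration (three simulated blobs), and the names `x`, `total_ss`, `within_ss`, `between_ss` and `ratio` are my own, not part of the output in the question.

```r
# Illustrative data: three well-separated 2-D blobs (simulated, not the asker's data).
set.seed(1)
centers <- c(0, 5, 10)
x <- do.call(rbind, lapply(centers, function(m) matrix(rnorm(60, mean = m), ncol = 2)))

fit <- kmeans(x, centers = 3, nstart = 10)

# total_SS: squared distances of all points to the overall centroid.
total_ss <- sum(scale(x, center = TRUE, scale = FALSE)^2)

# within_SS: squared distances of points to their own cluster centre, summed over clusters.
within_ss <- sum((x - fit$centers[fit$cluster, ])^2)

# between_SS is what remains once the within-cluster scatter is removed.
between_ss <- total_ss - within_ss

ratio <- between_ss / total_ss
cat(sprintf("between_SS / total_SS = %.1f %%\n", 100 * ratio))

# The same quantities are stored on the fitted object.
cat(sprintf("from the fit object:    %.1f %%\n", 100 * fit$betweenss / fit$totss))
```

Both lines should print the same percentage, since `fit$totss`, `fit$tot.withinss` and `fit$betweenss` hold the same three quantities.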
PAM
According to the ?pam help page, the algorithm first looks for a good initial set of medoids (this is called the build phase). Then it finds a local minimum for the objective function, that is, a solution such that there is no single switch of an observation with a medoid that will decrease the objective (this is called the swap phase).
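The values listed are the values of the objective function at the end of each of the two stages. The objective is based on the dissimilarities between each point and its medoid (the cluster package reports it as an average per observation rather than a raw sum). The swap value can never exceed the build value, and lower is better. Again, this is a measure of how well the points clustered.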
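A small sketch of how to inspect these two numbers and relate them to the point-to-medoid distances, assuming the `cluster` package is installed. The data are simulated for illustration only, and `d_to_medoid` and `avg_diss` are hypothetical helper names of my own.

```r
library(cluster)  # provides pam()

# Illustrative data: two simulated 2-D blobs (not the asker's data).
set.seed(1)
x <- rbind(matrix(rnorm(60, mean = 0), ncol = 2),
           matrix(rnorm(60, mean = 6), ncol = 2))

fit <- pam(x, k = 2)

# Named vector with the objective after the build phase and after the swap phase.
print(fit$objective)

# Dissimilarity of each observation to the medoid of its assigned cluster.
d <- as.matrix(dist(x))                      # Euclidean, matching pam()'s default metric
medoid_rows <- fit$id.med[fit$clustering]    # row index of the medoid for each observation
d_to_medoid <- d[cbind(seq_len(nrow(x)), medoid_rows)]

# As far as I understand, the reported objective is an average, so this
# should be close to fit$objective["swap"]; compare the two to check.
avg_diss <- mean(d_to_medoid)
cat("mean distance to assigned medoid:", avg_diss, "\n")
cat("objective after swap phase:      ", fit$objective[["swap"]], "\n")
```

If the build and swap values are nearly identical, as in the question (211.66 versus 210.57), the initial medoids were already close to a local optimum and the swap phase improved them only slightly.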
For more details, see the pam help page (?pam), the pam.object help page (?pam.object), the Wikipedia page on k-medoids, or the original paper: Kaufman, L. and Rousseeuw, P.J. (1987), Clustering by Means of Medoids.
Upvotes: 2