Reputation: 39
I am working on a cluster analysis with PAM in R. I computed the Gower distance for my data with vegdist(), and computing a cluster variable with pam() works well. Now I need a measure to determine the right k. The method I know is to visually compare the within-cluster sum of squares (WSS) for different values of k. How can I fetch the WSS from a series of PAM runs to compare the sums in a plot, analogously to this example for k-means? http://rstudio-pubs-static.s3.amazonaws.com/137758_a80b40255fdd440ab76b41a646a6c482.html#loops
Upvotes: 0
Views: 2772
Reputation: 77454
PAM does not optimize WSS. WSS is the k-means objective.
Instead, use the PAM objective (maybe called TD in literature?).
See ?[pam.object][1] for the `objective` field:
`objective`: the objective function after the first and second step of the pam algorithm.
Beware that, similar to WSS, `objective` is supposed to decrease with increasing k. Thus you can't just choose the minimum, but you should look for a knee in the plot.
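If your PAM implementation uses random initialization, you may want to run each k multiple times and keep only the best result. Note that pam() from the cluster package picks its initial medoids with a deterministic build phase by default, so repeated runs on the same data return the same result.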
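A minimal sketch of such a plot, assuming the cluster package is installed. The iris columns and daisy() are stand-ins used only to keep the example self-contained; a Gower distance from vegdist() should be usable in place of `d`. The names `d`, `ks` and `td` are made up for illustration.

```r
library(cluster)  # provides pam() and daisy()

# Illustrative data only: numeric iris columns stand in for the real data set.
d <- daisy(iris[, 1:4], metric = "gower")

ks <- 2:8  # candidate numbers of clusters (arbitrary range for the example)

# pam()$objective holds two values named "build" and "swap";
# the "swap" entry is the objective after the final step.
td <- sapply(ks, function(k) pam(d, k = k, diss = TRUE)$objective[["swap"]])

# Plot the objective against k and look for a knee rather than the minimum.
plot(ks, td, type = "b", xlab = "k", ylab = "PAM objective (after swap)")
```

The same loop works with any dissimilarity object that pam() accepts; only the line that builds `d` needs to change.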
Upvotes: 1