mango

Reputation: 39

Compute within sum of squares from PAM cluster analysis in R

I am working on a cluster analysis with PAM in R. I computed the gower distance for my data with vegdist() and computing a cluster variable with pam() works well. Now I need a measure to determine the right k. The method I know is to visually compare the within sum of squares for different ks. How can I fetch the WSS from a series of PAM iterations to compare the sums in a plot, analogously to this example for kmeans? http://rstudio-pubs-static.s3.amazonaws.com/137758_a80b40255fdd440ab76b41a646a6c482.html#loops

Upvotes: 0

Views: 2772

Answers (1)

Has QUIT--Anony-Mousse

Reputation: 77454

PAM does not optimize WSS. WSS is the k-means objective.

Instead, use the PAM objective (sometimes called TD, for total deviation, in the literature).

See ?pam.object for the objective field:

objective

the objective function after the first and second step of the pam algorithm.

Beware that similar to WSS, objective is supposed to decrease with increasing k. Thus you can't just choose the minimum, but you should look for a knee in the plot.
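A minimal sketch of how such a plot could be produced, under some assumptions: the built-in iris data stands in for your data, and cluster::daisy() with metric = "gower" stands in for your vegdist() call (any dist object should work the same way). The names d, ks and td are just illustrative.

```r
library(cluster)  # provides pam() and daisy()

# Assumption: iris is placeholder data; replace d with your own Gower distances
d <- daisy(iris, metric = "gower")

# Candidate numbers of clusters
ks <- 2:8

# objective holds two values (after the first and second step of pam);
# keep the second one, i.e. the final value, for each k
td <- sapply(ks, function(k) pam(d, k = k)$objective[[2]])

# Look for a knee in this curve rather than for its minimum
plot(ks, td, type = "b",
     xlab = "Number of clusters k",
     ylab = "PAM objective")
```

This mirrors the WSS loop from the k-means example, only with the PAM objective in place of the within sum of squares.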

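Note that pam() from the cluster package picks its initial medoids deterministically, so repeated runs on the same data give the same result. If you use a PAM variant with random initialization, you may want to run each k multiple times, and keep the best result only.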

Upvotes: 1
