Bryon

Reputation: 1017

Elbow Method for optimal no. of clusters

I have a dataset that I am analysing to find the optimal number of clusters using k-means.

I am testing cluster counts from 1 to 11, which produces the following plot:

[image: elbow plot for 1 to 11 clusters]

The original dataset has six classes, but the elbow plot shows the bend occurring at 3 clusters. Out of curiosity, I overlaid a straight line on the plot, starting at 11 clusters and working back, and the curve follows it almost exactly down to 6 clusters. That suggests to me that the real elbow is at 6, although it is hard to see.

So visually 3 looks like the right answer, but given the known number of classes (6), the straight line I drew points to 6...

Question:

Upvotes: 1

Views: 2815

Answers (1)

Naphat Amundsen

Reputation: 1623

Based on the plot, I'd say there are 6 clusters. In my experience, it makes sense to place the "elbow" at the point where the within-cluster sum of squares begins to decrease linearly.
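To make the "begins to decrease linearly" idea less dependent on eyeballing, here is a minimal sketch that draws a chord from each candidate k to the last k and returns the smallest k whose chord the curve stays close to. The function name `linear_tail_start`, the tolerance `tol`, and the numbers in `example_wcss` are all made up for illustration (they are not the values from the plot in the question). With scikit-learn, the real values would typically come from the `inertia_` attribute of a fitted `KMeans`.

```python
def linear_tail_start(wcss, tol=0.02):
    """Return the smallest k from which WCSS falls roughly on a straight line.

    wcss[i] is the within-cluster sum of squares for k = i + 1.
    tol is a hypothetical tolerance, given as a fraction of the total
    WCSS range; it is a tuning choice, not a standard value.
    """
    n = len(wcss)
    span = max(wcss) - min(wcss)
    if n < 3 or span == 0:
        raise ValueError("need at least 3 WCSS values that are not all equal")

    k_last, w_last = n, wcss[-1]
    for k in range(1, n - 1):  # leave at least one point between k and k_last
        w_k = wcss[k - 1]
        slope = (w_last - w_k) / (k_last - k)
        # Largest vertical gap between the chord and the points under it.
        worst = max(
            abs(wcss[j - 1] - (w_k + slope * (j - k)))
            for j in range(k + 1, k_last)
        )
        if worst <= tol * span:
            return k
    # No candidate passed: only the last two points are (trivially) on a line.
    return n - 1


# Made-up WCSS values for k = 1..11, exactly linear from k = 6 onwards.
example_wcss = [1000, 600, 350, 300, 260, 200, 180, 160, 140, 120, 100]
elbow = linear_tail_start(example_wcss)
print("linear tail starts at k =", elbow)
```

The result depends on `tol`: a looser tolerance lets the straight-line region start earlier, so this is a way to quantify the visual judgement rather than to replace it.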

However, for cluster validation I recommend silhouette coefficients, because they give an objective criterion: the preferred number of clusters is the one with the highest average silhouette coefficient. Silhouette coefficients also take the separation between clusters into account, not just how compact each cluster is.
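As a rough sketch of the silhouette approach, assuming scikit-learn is available: the data below is synthetic (six well-separated blobs from `make_blobs`, standing in for the real dataset), and the names `X`, `scores` and `best_k` are only illustrative. The silhouette coefficient is undefined for a single cluster, so the loop starts at k = 2.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic stand-in for the real data: six blobs on a 3 x 2 grid.
centers = np.array(
    [[0, 0], [10, 0], [20, 0], [0, 10], [10, 10], [20, 10]], dtype=float
)
X, _ = make_blobs(n_samples=300, centers=centers, cluster_std=1.0, random_state=0)

scores = {}
for k in range(2, 12):  # silhouette needs at least 2 clusters
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    # Mean silhouette coefficient over all samples, in [-1, 1].
    scores[k] = silhouette_score(X, labels)

best_k = max(scores, key=scores.get)
for k, s in scores.items():
    print(f"k={k:2d}  mean silhouette={s:.3f}")
print("best k by mean silhouette:", best_k)
```

On real data the maximum may be less clear-cut than on these blobs, so it is worth looking at the whole set of scores and not only at `best_k`.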

Upvotes: 1
