Reputation: 71
I wonder whether it's normal to get such high scores when using the elbow method or the silhouette score:
I have a database. I dropped all the unnecessary columns (mostly IDs), encoded all the categorical variables, dealt with the NAs, and scaled the features, but I'm still in doubt. These numbers are huge:
Elbow method:
Upvotes: 2
Views: 44
Reputation: 15
If your dataset is large, these values are normal. The distortion plotted in the elbow method is a sum of squared Euclidean distances between each point and its cluster center in a multidimensional space, so it grows with the number of rows and features. I observed the same thing in my own project, and it caused no problem with my accuracy values. The elbow and silhouette examples you find online show low distortion scores because most of them use small datasets (iris, wine quality, etc.).
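To illustrate why the totals get large, here is a minimal sketch. It assumes the distortion is the sum of squared Euclidean distances from each point to its nearest center (as in k-means inertia). It does not run k-means: the helper names `make_points` and `distortion`, the synthetic Gaussian data, and the single fixed center at the origin are all made up for illustration.

```python
import random


def make_points(n_samples, n_features, seed=0):
    """Synthetic stand-in for scaled data: zero mean, unit variance per feature."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(n_features)]
            for _ in range(n_samples)]


def distortion(points, centers):
    """Sum of squared Euclidean distances from each point to its nearest center."""
    total = 0.0
    for p in points:
        total += min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers)
    return total


n_features = 4
centers = [[0.0] * n_features]  # one hypothetical center at the origin

small_n, large_n = 100, 10_000
small_total = distortion(make_points(small_n, n_features), centers)
large_total = distortion(make_points(large_n, n_features), centers)

# The total grows with the number of rows; the per-row average stays similar.
print("total distortion:", small_total, "vs", large_total)
print("per-row average: ", small_total / small_n, "vs", large_total / large_n)
```

If the per-row average looks reasonable, a huge total mostly reflects the size of the dataset rather than a problem with the clustering.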
Upvotes: 1