user4544869

Reputation: 53

How to print the optimal number of clusters using fviz_nbclust

I need help finding the optimal number of clusters for k-means clustering in R.

My code is:

library(cluster)
library(factoextra)


#read data
# use a forward slash: in an R string, "\f" is parsed as a form-feed escape
data <- read.csv("../file.txt", header=FALSE, sep=" ")

#determine number of clusters to use
k.max<- 22
wss <- sapply(2:k.max, function(k){kmeans(data, k, nstart=10 )$tot.withinss})

print(wss)

plot(2:k.max, wss, type="b", pch = 19,  xlab="Number of clusters K", ylab="Total within-clusters sum of squares")


fviz_nbclust(data, kmeans, method = "wss") + geom_vline(xintercept = 3, linetype = 2)

I get the plot, but I still do not know how to get the optimal number out of it.

Thanks

My plot (in this link) shows the relation between wss and the number of clusters, but gives no information about the optimal number of clusters.

Upvotes: 3

Views: 10858

Answers (2)

Yan.Li

Reputation: 41

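# df is the data to cluster (called `data` in the question)
n_clust <- fviz_nbclust(df, kmeans, method = "silhouette", k.max = 30)
# the returned ggplot object keeps the plotted values in $data (columns: clusters, y)
n_clust <- n_clust$data
# clusters is a factor, so convert via character before taking the k with the highest average silhouette
max_cluster <- as.numeric(as.character(n_clust$clusters[which.max(n_clust$y)]))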
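
For anyone wondering what the silhouette method is doing here, the following is a rough sketch of the same idea done by hand with `cluster::silhouette`: run k-means for each k, average the silhouette widths, and take the k with the largest average. The `toy` data, the seed, and the range `2:8` are made up for illustration, and this is not necessarily identical to what `fviz_nbclust` does internally.

```r
library(cluster)  # provides silhouette()

set.seed(1)  # arbitrary seed so the toy data is reproducible

# Hypothetical toy data: three 2-D blobs centered at (0,0), (3,3) and (6,6)
toy <- rbind(
  matrix(rnorm(100, mean = 0, sd = 0.3), ncol = 2),
  matrix(rnorm(100, mean = 3, sd = 0.3), ncol = 2),
  matrix(rnorm(100, mean = 6, sd = 0.3), ncol = 2)
)

d <- dist(toy)    # pairwise Euclidean distances, reused for every k
k_values <- 2:8   # the silhouette is undefined for k = 1

# Average silhouette width of the k-means partition for each k
avg_sil <- sapply(k_values, function(k) {
  cl <- kmeans(toy, centers = k, nstart = 10)$cluster
  mean(silhouette(cl, d)[, "sil_width"])
})

# The k with the largest average silhouette width
best_k <- k_values[which.max(avg_sil)]
print(best_k)
```

Note that this always returns some k, even when the data has no real cluster structure, so it is worth looking at the silhouette values themselves and not only at the argmax.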

Upvotes: 4

Has QUIT--Anony-Mousse

Reputation: 77454

There is no sound mathematical definition of the "elbow" (because of having different scales on x and y, there is no angle), and in plots like yours there probably is no "elbow" at all.

Most likely, k-means did not work for any k. This happens quite often, for example when your data doesn't contain clusters.

Try generating uniform data and making the same plot; it will look similar.
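
A minimal sketch of that experiment, using only base R. The sample size, the two dimensions, the seed, and the range of k are arbitrary choices for illustration, not values taken from the question.

```r
# WSS curve for data that has no cluster structure at all
set.seed(42)                               # arbitrary seed, for reproducibility
unif <- matrix(runif(500 * 2), ncol = 2)   # 500 points, uniform on the unit square

k_values <- 1:10
wss_unif <- sapply(k_values, function(k) {
  kmeans(unif, centers = k, nstart = 10)$tot.withinss
})

# Same kind of plot as in the question
plot(k_values, wss_unif, type = "b", pch = 19,
     xlab = "Number of clusters K",
     ylab = "Total within-clusters sum of squares")
```

If the curve from the real data looks much like this one, that suggests the drop in WSS comes from merely partitioning the space into more pieces and not from k-means finding actual clusters.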

Upvotes: 0
