ab20225

Reputation: 103

How to evaluate a clustering algorithm in Python?

My data has 61 rows and 56 columns. I have tested several clustering algorithms and want to evaluate them, but I ran into some problems. So far I have only managed to apply the silhouette coefficient. I performed K-means clustering using this code:

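from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

kmean = KMeans(n_clusters=6)
kmean.fit(X)  # X is the 61 x 56 data matrix
kmean.labels_  # cluster index assigned to each row
# Evaluation
silhouette_score(X, kmean.labels_)
# ==> 0.09231070598844496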

I would like to try more metrics, such as:

metrics.homogeneity_score,
metrics.completeness_score,
metrics.v_measure_score,
metrics.adjusted_rand_score,
metrics.adjusted_mutual_info_score,

I want to evaluate my clustering, but I do not know how. What do labels_true and labels_pred mean? How can I use the sklearn evaluation metrics?

Upvotes: 0

Views: 2686

Answers (1)

Akshay Ijantkar

Reputation: 21

labels_true: the ground-truth (actual) label of each sample

labels_pred: the label assigned to each sample by the clustering model

For example:

labels_pred = clustering_model.predict(model_df.values)

All of the metrics below need ground-truth labels; they are external metrics, not internal ones:

metrics.homogeneity_score,
metrics.completeness_score,
metrics.v_measure_score,
metrics.adjusted_rand_score,
metrics.adjusted_mutual_info_score,
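As a minimal sketch of how these calls look when ground truth is available: the data below is synthetic (generated with make_blobs, which is an assumption for illustration, since the data in the question has no known labels), and the names external_scores, labels_true, and labels_pred are chosen here only for the example.

```python
from sklearn import metrics
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic data with known cluster membership (illustrative assumption;
# the data in the question comes without such labels).
X, labels_true = make_blobs(n_samples=61, centers=3, n_features=5, random_state=0)

# Cluster labels produced by the model for the same samples.
labels_pred = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# Every external metric takes (labels_true, labels_pred).
external_scores = {
    "homogeneity": metrics.homogeneity_score(labels_true, labels_pred),
    "completeness": metrics.completeness_score(labels_true, labels_pred),
    "v_measure": metrics.v_measure_score(labels_true, labels_pred),
    "adjusted_rand": metrics.adjusted_rand_score(labels_true, labels_pred),
    "adjusted_mutual_info": metrics.adjusted_mutual_info_score(labels_true, labels_pred),
}

for name, value in external_scores.items():
    print(f"{name}: {value:.3f}")
```

These scores compare two labelings only up to a renaming of the clusters, so the cluster numbers in labels_pred do not have to match the class numbers in labels_true.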

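Without ground-truth labels, you can use an internal metric instead: silhouette_score, calinski_harabasz_score, or davies_bouldin_score (all in sklearn.metrics), or the Dunn index, which scikit-learn does not provide and which would need a separate implementation.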
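The internal metrics available in scikit-learn need only the data and the predicted labels, so they can be compared across several cluster counts. The sketch below uses synthetic data and a hypothetical internal_scores dictionary purely for illustration; the range of k values is an arbitrary choice, not a recommendation.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import (
    calinski_harabasz_score,
    davies_bouldin_score,
    silhouette_score,
)

# Synthetic stand-in for the real feature matrix (illustrative assumption).
X, _ = make_blobs(n_samples=61, centers=4, n_features=5, random_state=0)

internal_scores = {}
for k in range(2, 8):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    internal_scores[k] = {
        # Higher is better, range [-1, 1].
        "silhouette": silhouette_score(X, labels),
        # Higher is better, unbounded above.
        "calinski_harabasz": calinski_harabasz_score(X, labels),
        # Lower is better, minimum 0.
        "davies_bouldin": davies_bouldin_score(X, labels),
    }

for k, scores in internal_scores.items():
    print(k, {name: round(value, 3) for name, value in scores.items()})
```

Note that the three scores point in different directions: silhouette and Calinski-Harabasz improve as they increase, while Davies-Bouldin improves as it decreases.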

Upvotes: 1
