Reputation: 103
My data has 61 rows and 56 columns. I have tested several clustering algorithms and want to evaluate them, but I ran into some problems. So far I have only managed to apply the silhouette coefficient. I performed K-means clustering using this code:
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
kmean = KMeans(n_clusters=6)
kmean.fit(X)
kmean.labels_
# Evaluation
silhouette_score(X, kmean.labels_)
==> 0.09231070598844496
I would like to try more metrics, such as:
metrics.homogeneity_score,
metrics.completeness_score,
metrics.v_measure_score,
metrics.adjusted_rand_score,
metrics.adjusted_mutual_info_score,
I want to evaluate my clustering, but I do not know how. What do labels_true and labels_pred mean? How can I use the sklearn evaluation metrics?
Upvotes: 0
Views: 2686
Reputation: 21
labels_true: ground-truth values (the actual labels)
labels_pred: labels predicted by the clustering model
For example:
labels_pred = clustering_model.predict(model_df.values)
All of the metrics below need ground-truth labels; they are external metrics, not internal ones:
metrics.homogeneity_score,
metrics.completeness_score,
metrics.v_measure_score,
metrics.adjusted_rand_score,
metrics.adjusted_mutual_info_score,
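As a rough sketch of how these ground-truth metrics are called, the snippet below uses synthetic data from make_blobs as a stand-in for the real 61 x 56 matrix, because the question's data has no known labels. The variable names, random_state, and n_init values are assumptions chosen for illustration only.

```python
from sklearn import metrics
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Assumption: synthetic data with known group membership, shaped like the
# question's data (61 samples, 56 features, 6 groups).
X, labels_true = make_blobs(n_samples=61, n_features=56, centers=6, random_state=0)

kmean = KMeans(n_clusters=6, n_init=10, random_state=0).fit(X)
labels_pred = kmean.labels_  # cluster assignment of each training sample

# Every external metric takes (labels_true, labels_pred).
external_scores = {
    "homogeneity": metrics.homogeneity_score(labels_true, labels_pred),
    "completeness": metrics.completeness_score(labels_true, labels_pred),
    "v_measure": metrics.v_measure_score(labels_true, labels_pred),
    "adjusted_rand": metrics.adjusted_rand_score(labels_true, labels_pred),
    "adjusted_mutual_info": metrics.adjusted_mutual_info_score(labels_true, labels_pred),
}

for name, value in external_scores.items():
    print(f"{name}: {value:.3f}")
```

If your real data has no such reference labels, these calls cannot be used, since there is nothing to pass as labels_true.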
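Without ground truth, you can use internal metrics instead: silhouette_score, calinski_harabasz_score, or davies_bouldin_score, all from sklearn.metrics. The Dunn index is another option, but scikit-learn does not provide it, so you would need another package or your own implementation.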
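The internal metrics only need the data and the predicted labels, so they work without any ground truth. A minimal sketch, again assuming synthetic make_blobs data in place of the real matrix and an arbitrary range of cluster counts to compare:

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import (
    calinski_harabasz_score,
    davies_bouldin_score,
    silhouette_score,
)

# Assumption: stand-in data; replace X with your own 61 x 56 array.
X, _ = make_blobs(n_samples=61, n_features=56, centers=6, random_state=0)

internal_scores = {}
for k in range(2, 9):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    internal_scores[k] = {
        "silhouette": silhouette_score(X, labels),                # higher is better, in [-1, 1]
        "calinski_harabasz": calinski_harabasz_score(X, labels),  # higher is better
        "davies_bouldin": davies_bouldin_score(X, labels),        # lower is better
    }

for k, scores in internal_scores.items():
    print(k, {name: round(value, 3) for name, value in scores.items()})
```

Comparing these scores across several values of n_clusters (or across different algorithms) is one common way to choose between clusterings when no labels are available.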
Upvotes: 1