Reputation: 81
When using the K-means clustering method in sklearn, I clustered the points into two groups. How can I make sure the group with more points gets the label 0 (instead of 1) in k_means.labels_?
Thanks!
Upvotes: 1
Views: 985
Reputation: 2487
Generally, if you have fully labeled data, you should be using a classifier (see this excellent graphic). K-means starts from randomly chosen centroids, so there is no way to guarantee which cluster is assigned to which label.
Once you have the predictions, if you want to reverse the class labels, you can do something like this:
predictions = k_means.fit_predict(my_data)
corrected_predictions = predictions.copy()
# Swap the two labels only if cluster 1 is the larger one
if sum(predictions == 1) > sum(predictions == 0):
    corrected_predictions[predictions == 1] = 0
    corrected_predictions[predictions == 0] = 1
Mucking about with the automatically computed members of a class (like k_means.labels_) is not recommended.
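If you have more than two clusters, the same idea can be generalized by ranking the clusters by size and renumbering them. The following is only a minimal sketch: relabel_by_size is a hypothetical helper (not part of sklearn), and the example list stands in for the values of k_means.labels_. It builds a new list rather than modifying the fitted estimator.

```python
from collections import Counter

def relabel_by_size(labels):
    """Return new labels where 0 is the largest cluster, 1 the next largest, and so on."""
    counts = Counter(labels)
    # Order the old labels by descending cluster size; ties fall back to the old label value
    ordered = sorted(counts, key=lambda lab: (-counts[lab], lab))
    # Map each old label to its rank in that ordering
    mapping = {old: new for new, old in enumerate(ordered)}
    return [mapping[lab] for lab in labels]

# Hypothetical labels standing in for k_means.labels_
example = [1, 1, 0, 1, 0, 1]
print(relabel_by_size(example))  # [0, 0, 1, 0, 1, 0]
```

The function accepts any iterable of integer labels, so passing k_means.labels_ directly should also work, with the result coming back as a plain list.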
Upvotes: 2