JZ.

Reputation: 81

How to label the group with more points as ‘0’ in sklearn k-means clustering

I used the K-means clustering method in sklearn to cluster my points into two groups. How can I make sure that the group with more points gets the label ‘0’ (instead of 1) in k_means.labels_?

Thanks!

Upvotes: 1

Views: 985

Answers (1)

Andreus

Reputation: 2487

Generally, if you have fully labeled data, you should be using a classifier (see this excellent graphic). K-means starts from a partially random initialization and the cluster indices it returns are arbitrary, so there is no way to guarantee which cluster is assigned to which label.

Once you have the predictions, if you want to reverse the class labels, you can do something like this:

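predictions = k_means.fit_predict(my_data)
# Start from a copy so corrected_predictions is defined in both cases
corrected_predictions = predictions.copy()
# Swap the two labels only if cluster 1 is the larger one
if (predictions == 1).sum() > (predictions == 0).sum():
    corrected_predictions[predictions == 1] = 0
    corrected_predictions[predictions == 0] = 1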

Mucking about with the automatically computed members of a class (like k_means.labels_) is not recommended.
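If you ever need this for more than two clusters, the same idea can be generalized. The following is only a sketch, not part of sklearn: relabel_by_size is a hypothetical helper name, and it assumes the labels are non-negative integers, as returned by fit_predict. It returns a new array and leaves k_means.labels_ untouched.

```python
import numpy as np

def relabel_by_size(labels):
    """Return a renumbered copy of labels so that 0 is the largest cluster,
    1 the second largest, and so on. The input array is not modified."""
    labels = np.asarray(labels)
    # Number of points per original label (labels must be non-negative ints)
    counts = np.bincount(labels)
    # Original labels ordered from largest to smallest cluster;
    # a stable sort keeps the original order when sizes are tied
    order = np.argsort(-counts, kind="stable")
    # Build a lookup table: mapping[old_label] = new_label
    mapping = np.empty_like(order)
    mapping[order] = np.arange(order.size)
    return mapping[labels]
```

With the names from the snippet above, you would call it as corrected_predictions = relabel_by_size(k_means.fit_predict(my_data)). Note that the cluster centers in k_means.cluster_centers_ keep their original order, so index them with the original labels, not the relabeled ones.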

Upvotes: 2
