tumbleweed

Reputation: 4640

How do I know which matrix row each cluster label corresponds to?

After doing clustering I end up with an object which stores all the cluster labels, something like this:

clusterer.labels_

The above is typically a list or an array. Then I always assign the labels to the original pandas dataframe (dataset) like this:

df['cluster_labels'] = clusterer.labels_

In the end, I assume that each element of clusterer.labels_ corresponds to the row at the same position in my original dataset. Is that assumption correct? For example, the column assignment above gives me something like this:

ColA  ColB  cluster_labels
1     3     -1
2     4      2
...
89    90    45

Upvotes: 1

Views: 380

Answers (1)

RichardOwen

Reputation: 21

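Broadly, yes: your assumption is correct. The only clustering I have used is KMeans (https://scikit-learn.org/stable/modules/generated/sklearn.cluster.KMeans.html), where labels_ holds one label per row, in the same order as the data passed to fit, but I can't guarantee that every clustering implementation works like that. As long as the dataframe has the same rows, in the same order, as the data you fitted on, appending labels_ as a new column will work as you think it will.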
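If you want to check this yourself, here is a minimal sketch with scikit-learn's KMeans. The toy data, the letter index and the subset filter are made up for illustration; only ColA, ColB, clusterer and cluster_labels come from your question. It also shows the one case where I would be careful: when you fit on only part of the dataframe, wrapping the labels in a Series with the subset's index lets pandas align them by index rather than by position.

```python
import pandas as pd
from sklearn.cluster import KMeans

# Hypothetical toy data; the column names mirror the question.
df = pd.DataFrame(
    {"ColA": [1, 2, 10, 11, 50, 51], "ColB": [3, 4, 12, 13, 60, 61]},
    index=list("abcdef"),
)

clusterer = KMeans(n_clusters=3, n_init=10, random_state=0).fit(df[["ColA", "ColB"]])

# labels_ has one entry per fitted row, in the order the rows were passed to fit().
assert len(clusterer.labels_) == len(df)

# Assigning a NumPy array is positional: the i-th label goes to the i-th row.
df["cluster_labels"] = clusterer.labels_

# If only a subset of rows was fitted, attach the subset's index to the labels
# so that pandas aligns them by index; rows that were not fitted get NaN.
subset = df[df["ColA"] > 5]
sub_clusterer = KMeans(n_clusters=2, n_init=10, random_state=0).fit(subset[["ColA", "ColB"]])
df["subset_labels"] = pd.Series(sub_clusterer.labels_, index=subset.index)

print(df)
```

Note that the label numbers themselves are arbitrary, so two runs can number the same clusters differently; what stays fixed is that position i in labels_ belongs to row i of the fitted data.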

Upvotes: 1
