Urvish

Reputation: 713

How to get name from the cluster from KMeans clustering?

I am clustering historical trader data with KMeans. I have 10 traders and I am grouping them into 3 clusters. After fitting, I have the cluster label of each row, and now I want to know the names of the traders in each cluster. For example, if Cluster-0 has 3 traders, the output should be something like {'Cluster0': ['Name1', 'Name2', 'Name3'], 'Cluster1': ['Name5', 'Name4', 'Name6']} and so on. I was able to get the indices of the data points that belong to each cluster with:

cluster_dict = {i: np.where(data['Labels'] == i) for i in range(n_clusters)}

I also know which index range in the new data belongs to each trader, e.g. 0-16 is trader1, 16-32 is trader2, and so on. I also have the names of the traders in a list: ['name1', 'name2', 'name3'].

Is there any way to get back the names of the traders belonging to each cluster, as described above? If so, please help me with this.

Upvotes: 1

Views: 6528

Answers (1)

Mohamed Thasin ah

Reputation: 11192

I think you need something like the following.

First get the cluster labels and assign them to your dataframe, then group by label and take the unique values of the name column (A, B, C).

The following code snippet demonstrates this on a small example.

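from sklearn.cluster import KMeans
import pandas as pd

# Columns 0 and 1 are the features; column 2 holds the name (A, B, C).
X = pd.DataFrame([[1, 2, 'A'], [1, 4, 'A'], [1, 0, 'B'], [4, 2, 'C'], [4, 4, 'C'], [4, 0, 'B']])
# Cluster on the numeric columns only.
kmeans = KMeans(n_clusters=2, random_state=0).fit(X[[0, 1]])
# Attach each row's cluster label, then list the unique names per cluster.
X['label'] = kmeans.labels_
print(X.groupby('label')[2].unique())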

Output:

label
0    [A, B]
1    [C, B]

For a dict representation:

print(X.groupby('label')[2].unique().to_dict())

Output:

{0: array(['A', 'B'], dtype=object), 1: array(['C', 'B'], dtype=object)}
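
If plain lists under keys like 'Cluster0' are preferred (the shape the question asks for), one possible sketch is below. It reuses the toy frame from the example above, but the labels are hard-coded rather than taken from KMeans so the result is reproducible; the 'Cluster' key prefix and the variable name names_by_cluster are only illustrative choices.

```python
import pandas as pd

# Toy frame mirroring the example above: column 2 holds the names.
X = pd.DataFrame([[1, 2, 'A'], [1, 4, 'A'], [1, 0, 'B'],
                  [4, 2, 'C'], [4, 4, 'C'], [4, 0, 'B']])
# Assumption: hard-coded labels standing in for kmeans.labels_.
X['label'] = [0, 0, 0, 1, 1, 1]

# Iterate over (label, names) groups and turn each group's unique
# names into a plain Python list.
names_by_cluster = {
    f"Cluster{label}": names.unique().tolist()
    for label, names in X.groupby('label')[2]
}
print(names_by_cluster)
# {'Cluster0': ['A', 'B'], 'Cluster1': ['C', 'B']}
```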

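To get the result in the same dataframe, map each row's label to its cluster's unique names (recent pandas versions reject transform('unique') as an invalid transform name):

X['cluster_name'] = X['label'].map(X.groupby('label')[2].unique())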

Output:

   0  1  2  label cluster_name
0  1  2  A      0       [A, B]
1  1  4  A      0       [A, B]
2  1  0  B      0       [A, B]
3  4  2  C      1       [C, B]
4  4  4  C      1       [C, B]
5  4  0  B      1       [C, B]
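
Applied to the setup in the question, the same idea might look like the sketch below. It assumes every trader contributes one contiguous block of 16 rows (my reading of "0-16 trader1, 16-32 trader2") and that the names list is in the same order as the blocks. The labels array, rows_per_trader and the 'trader' column name are hypothetical stand-ins, not taken from the asker's data.

```python
import numpy as np
import pandas as pd

rows_per_trader = 16                        # assumption: 16 consecutive rows per trader
trader_names = ['name1', 'name2', 'name3']  # names in the same order as the row blocks

# Hypothetical cluster labels standing in for kmeans.labels_ (48 rows).
labels = np.array([0] * 16 + [1] * 10 + [0] * 6 + [2] * 16)

data = pd.DataFrame({'Labels': labels})
# Repeat each name once per row of its block so every row carries its trader's name.
data['trader'] = np.repeat(trader_names, rows_per_trader)

# Collect the unique trader names found in each cluster.
traders_by_cluster = {
    f"Cluster{label}": group.unique().tolist()
    for label, group in data.groupby('Labels')['trader']
}
print(traders_by_cluster)
# {'Cluster0': ['name1', 'name2'], 'Cluster1': ['name2'], 'Cluster2': ['name3']}
```

Note that a trader whose rows are split across clusters (name2 here) shows up under more than one key.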

Upvotes: 1
