ziiho_

Reputation: 43

kmeans clustering python

I have a list of coordinates that I want to cluster:

[[0, 107], [0, 108], [0, 109], [0, 115], [0, 116],
[0, 117], [0, 118], [0, 125], [0, 126], [0, 127],
[0, 128], [0, 135], [0, 136], [0, 194], [0, 195],
[1, 107], [1, 108], [1, 109], [1, 110], [1, 114],
[1, 115], [1, 116], [1, 117], [1, 118], [1, 119]...]

This is the result of clustering them with KMeans:

from sklearn.cluster import KMeans
num_clusters = 9
km = KMeans(n_clusters=num_clusters)
km_fit = km.fit(nonzero_pred_sub)

>>> km_fit.labels_
array([7, 7, 7, 1, 1, 1, 1, 5, 5, 5, 5, 3, 3, 0, 0, 7, 7, 7, 7, 1, 1, 1,
   1, 1, 1, 5, 5, 5...]

I want to know the coordinates in the i-th cluster. For example, I need the elements of the 1st cluster, and I assume [0, 107], [0, 108], [0, 109] were assigned to the 7th cluster. How can I get the coordinates that belong to a given cluster?

Upvotes: 0

Views: 367

Answers (1)

SpaceBurger

Reputation: 549

I assume you want the coordinates assigned to the 7th cluster. You can get them by storing your result in a dictionary:

from sklearn.cluster import KMeans
km = KMeans(n_clusters=9)
km_fit = km.fit(nonzero_pred_sub)
labels = km_fit.labels_  # cluster id assigned to each input point

d = dict()  # dictionary linking cluster id to coordinates
for i in range(len(labels)):
  cluster_id = labels[i]

  # create the list the first time a cluster id is seen
  if cluster_id not in d:
    d[cluster_id] = []

  d[cluster_id].append(nonzero_pred_sub[i])

# that way you can access the 7th cluster coordinates like this
d[7]

>>> [[0, 107], [0, 108], [0, 109], [1, 107], [1, 108], [1, 109], [1, 110], ...]

To remove the "if" block in the loop, you can look into collections.defaultdict.
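As a minimal sketch of the defaultdict variant: the points and labels lists below are hypothetical stand-ins for nonzero_pred_sub and km_fit.labels_, so the snippet runs on its own.

```python
from collections import defaultdict

# Hypothetical stand-ins for nonzero_pred_sub and km_fit.labels_
points = [[0, 107], [0, 108], [0, 115], [1, 107], [1, 115]]
labels = [7, 7, 1, 7, 1]

# A missing key is created as an empty list on first access,
# so no "if cluster_id not in d" check is needed.
clusters = defaultdict(list)
for point, cluster_id in zip(points, labels):
    clusters[cluster_id].append(point)

print(clusters[7])  # [[0, 107], [0, 108], [1, 107]]
```

With real data you would iterate over zip(nonzero_pred_sub, km_fit.labels_) instead.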

You can surely also do this with pandas DataFrames, which make more complex results easier to manipulate.
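One way the pandas approach might look, as a rough sketch: the column names "row", "col" and "cluster" are my own choice, and points and labels are again hypothetical stand-ins for nonzero_pred_sub and km_fit.labels_.

```python
import pandas as pd

# Hypothetical stand-ins for nonzero_pred_sub and km_fit.labels_
points = [[0, 107], [0, 108], [0, 115], [1, 107], [1, 115]]
labels = [7, 7, 1, 7, 1]

# One row per point, with the assigned cluster id as an extra column
df = pd.DataFrame(points, columns=["row", "col"])
df["cluster"] = labels

# Coordinates of the points assigned to cluster 7
cluster_7 = df.loc[df["cluster"] == 7, ["row", "col"]].values.tolist()
print(cluster_7)  # [[0, 107], [0, 108], [1, 107]]

# Number of points in each cluster
sizes = df.groupby("cluster").size()
print(sizes)
```

Once the labels are a column, filtering, counting, or aggregating per cluster is a one-liner with groupby.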

If I misunderstood your question, and what you want is the coordinates of the center of the i-th cluster, you can get them with km_fit.cluster_centers_[i] (cf. doc).
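For illustration, here is a small sketch on made-up toy data (not your nonzero_pred_sub) that reads the center of one cluster and, as an alternative to the dictionary, selects its members with a NumPy boolean mask. The n_init and random_state values are only there to make the run repeatable.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical toy data: two well-separated groups of three points
points = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]])

km_fit = KMeans(n_clusters=2, n_init=10, random_state=0).fit(points)

i = km_fit.labels_[0]                    # cluster id assigned to the first point
center = km_fit.cluster_centers_[i]      # center of that cluster
members = points[km_fit.labels_ == i]    # boolean mask selects its points

print(center)
print(members)
```

Note that cluster ids are arbitrary: which group ends up as cluster 0 or 1 can change between runs unless random_state is fixed.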

Upvotes: 1
