user1767774

Reputation: 1825

Using scipy hierarchical clustering with objects

I have a list of objects, and a distance metric between objects. Can I use scipy's hierarchical clustering to cluster the objects (fclust1 seems to only accept vectors of floats)?

Alternatively, if this is not possible in scipy, is there another Python library in which this can be done?

Example:

 import random

 class MyObject(object):

     def __init__(self):
         self.vec1 = [random.choice(range(100)) for i in range(1000)]
         self.vec2 = [random.choice(range(100)) for i in range(1000)]

 def my_distance_metric(a1, a2):
     # pseudocode: some scalar function of a1.vec1, a1.vec2, a2.vec1, a2.vec2
     return ...

 objects = [MyObject() for _ in range(1000)]
 fclust1.cluster(objects, metric=my_distance_metric)

Thanks.

Upvotes: 0

Views: 448

Answers (1)

Warren Weckesser

Reputation: 114831

You can compute the condensed distance matrix of your objects and pass it to scipy.cluster.hierarchy.linkage to compute the linkage matrix. Then pass the linkage matrix to, say, scipy.cluster.hierarchy.fcluster or scipy.cluster.hierarchy.dendrogram.

For example,

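from scipy.cluster.hierarchy import linkage, dendrogram

# Condensed distance matrix: one entry per unordered pair (j, k) with j < k,
# in row-major order of the upper triangle.
n = len(objects)
condensed_dist = [my_distance_metric(objects[j], objects[k])
                  for j in range(n)
                  for k in range(j + 1, n)]

# linkage uses method='single' unless another method is given.
Z = linkage(condensed_dist)
# Draws the tree with matplotlib.
dendrogram(Z)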
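
If you want flat cluster labels rather than a plot, the same linkage matrix can be passed to fcluster. Below is a minimal sketch of that route. The `Item` class, the `item_distance` metric, and the two-group toy data are made up for illustration and only stand in for your own objects and metric; `method='average'` and cutting at two clusters are arbitrary choices.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform


class Item:
    """Hypothetical stand-in for the question's MyObject."""
    def __init__(self, values):
        self.values = values


def item_distance(a, b):
    # Hypothetical metric: mean absolute difference between the two vectors.
    return float(np.mean(np.abs(np.asarray(a.values) - np.asarray(b.values))))


rng = np.random.default_rng(0)
# Toy data: two well-separated groups of 5 items each.
items = ([Item(rng.normal(0.0, 1.0, size=20)) for _ in range(5)] +
         [Item(rng.normal(50.0, 1.0, size=20)) for _ in range(5)])

n = len(items)
condensed = [item_distance(items[j], items[k])
             for j in range(n)
             for k in range(j + 1, n)]

# squareform expands the condensed vector into the full symmetric n x n
# matrix, which is a convenient way to check the pair ordering.
square = squareform(condensed)

Z = linkage(condensed, method='average')
# Cut the tree into at most 2 flat clusters; labels[i] belongs to items[i].
labels = fcluster(Z, t=2, criterion='maxclust')
```

Note that the distance list grows as n*(n-1)/2, so for the 1000 objects in the question it holds 499500 values, each costing one call to the Python-level metric.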

Upvotes: 1
