Reputation: 1825
I have a list of objects, and a distance metric between objects. Can I use scipy's hierarchical clustering to cluster the objects (fclust1 seems to only accept vectors of floats)?
Alternatively, if this is not possible in scipy, is there another Python library in which this can be done?
Example:
    import random

    class MyObject(object):
        def __init__(self):
            self.vec1 = [random.choice(range(100)) for i in range(1000)]
            self.vec2 = [random.choice(range(100)) for i in range(1000)]

    def my_distance_metric(a1, a2):
        # Placeholder: return some scalar function of
        # a1.vec1, a1.vec2, a2.vec1, a2.vec2
        raise NotImplementedError

    objects = [MyObject() for i in range(1000)]
    # What I would like to be able to write:
    fclust1.cluster(objects, metric=my_distance_metric)
Thanks.
Upvotes: 0
Views: 448
Reputation: 114831
You can compute the condensed distance matrix of your objects and pass it to scipy.cluster.hierarchy.linkage to compute the linkage matrix. Then pass the linkage matrix to, say, scipy.cluster.hierarchy.fcluster or scipy.cluster.hierarchy.dendrogram.
For example,
    from scipy.cluster.hierarchy import linkage, dendrogram

    n = len(objects)
    condensed_dist = [my_distance_metric(objects[j], objects[k])
                      for j in range(n)
                      for k in range(j+1, n)]
    Z = linkage(condensed_dist)
    dendrogram(Z)
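If you want flat cluster labels rather than a plot, something like the following should work. Treat it as a sketch only: the `center` argument to `MyObject`, the sum-of-absolute-differences metric, the choice of `method="average"`, and the cut into two clusters are all assumptions made up for illustration, not anything taken from the question.

```python
import random
from scipy.cluster.hierarchy import linkage, fcluster


class MyObject:
    def __init__(self, center):
        # Illustrative data only: two short vectors scattered near `center`.
        self.vec1 = [center + random.random() for _ in range(5)]
        self.vec2 = [center + random.random() for _ in range(5)]


def my_distance_metric(a1, a2):
    # Hypothetical metric: sum of absolute differences over both vectors.
    d1 = sum(abs(x - y) for x, y in zip(a1.vec1, a2.vec1))
    d2 = sum(abs(x - y) for x, y in zip(a1.vec2, a2.vec2))
    return d1 + d2


random.seed(0)
# Two well-separated groups of ten objects each.
objects = [MyObject(0) for _ in range(10)] + [MyObject(100) for _ in range(10)]
n = len(objects)

# Condensed form: the upper triangle of the distance matrix, row by row,
# which has n * (n - 1) / 2 entries.
condensed_dist = [my_distance_metric(objects[j], objects[k])
                  for j in range(n)
                  for k in range(j + 1, n)]

# A 1-D input to linkage is interpreted as a condensed distance matrix.
Z = linkage(condensed_dist, method="average")

# Cut the tree into at most two flat clusters; labels[i] belongs to objects[i].
labels = fcluster(Z, t=2, criterion="maxclust")
```

The order of the nested loops matters here: the condensed vector has to list the pairs (0,1), (0,2), ..., (0,n-1), (1,2), ... in exactly that order for linkage to read the distances correctly.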
Upvotes: 1