user8510273

Calculate Jaccard distance using SciPy in Python

I have two separate lists as follows.

list1 =[[0.0, 0.75, 0.2], [0.0, 0.5, 0.7]]
list2 =[[0.9, 0.0, 0.8], [0.0, 0.0, 0.8], [1.0, 0.0, 0.0]]

I want to get a list1 x list2 Jaccard distance matrix, i.e. a 2 x 3 matrix containing 6 values.

For example:
[0.0, 0.75, 0.2] in list1 compared with each of the three lists in list2
[0.0, 0.5, 0.7] in list1 compared with each of the three lists in list2

I tried both pdist and cdist, but I get the following errors, respectively: "TypeError: pdist() got multiple values for argument 'metric'" and "ValueError: XA must be a 2-dimensional array."

How can I fix this?

Upvotes: 0

Views: 1262

Answers (1)

Lescurel

Reputation: 11631

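pdist expects a single m x n 2D array (m observations with n values each) and returns the distances between its rows. To compare each row of list1 with each row of list2, you can use a simple nested loop and pass each pair of rows to pdist as a 2 x n array. You could do something like this: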

import scipy.spatial.distance as dist

list1 = [[0.0, 0.75, 0.2], [0.0, 0.5, 0.7]]
list2 = [[0.9, 0.0, 0.8], [0.0, 0.0, 0.8], [1.0, 0.0, 0.0]]

distance = []
for elem1 in list1:
    for elem2 in list2:
        # pdist on a 2 x n array returns a 1-element array holding the
        # distance between the two rows, so take element [0]
        distance.append(dist.pdist([elem1, elem2], 'jaccard')[0])

The results are collected in the distance list: six entries, ordered row by row (each row of list1 against every row of list2).
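
As an alternative to the loop, cdist computes the distance between every row of one 2D input and every row of another, so it can return the 2 x 3 matrix in one call. The sketch below rests on an assumption and is not part of the code above: it first converts the values to booleans (nonzero means present), because Jaccard is defined on boolean vectors and, as far as I know, SciPy's treatment of non-boolean input for 'jaccard' has differed between versions. The names bool1, bool2 and matrix are only illustrative.

```python
import numpy as np
from scipy.spatial.distance import cdist

list1 = [[0.0, 0.75, 0.2], [0.0, 0.5, 0.7]]
list2 = [[0.9, 0.0, 0.8], [0.0, 0.0, 0.8], [1.0, 0.0, 0.0]]

# Assumption: treat every nonzero value as "present" so the inputs are
# boolean and the result does not depend on how SciPy handles floats.
bool1 = np.asarray(list1) != 0
bool2 = np.asarray(list2) != 0

# cdist takes two 2D arrays and returns a (len(list1), len(list2)) matrix;
# entry [i, j] is the Jaccard distance between list1[i] and list2[j].
matrix = cdist(bool1, bool2, metric='jaccard')

print(matrix.shape)
print(matrix)
```

With the boolean conversion, both rows of matrix come out as 2/3, 0.5 and 1.0, because the two rows of list1 have the same nonzero pattern. If you need a measure that takes the actual magnitudes into account, this boolean version is probably not what you want.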

Upvotes: 1
