cristian hantig

Reputation: 1310

How to compute a similarity metric between two Series containing lists?

Having the following Series:

import pandas as pd
a = pd.Series([[1, 2, 34], [2, 3], [2, 3, 4, 5, 1]], index=[1, 2, 3])

1         [1, 2, 34]
2             [2, 3]
3    [2, 3, 4, 5, 1]

and the following metric:

import numpy as np
def metric(x, y):
    # Number of distinct values that appear in both x and y
    return len(np.intersect1d(x, y))

I want to compute this metric for every pair of elements in the Series, and the result should be:

   1  2  3
1  3  1  2
2  1  2  2
3  2  2  5

So far I have used this:

sims = a.map(lambda x: a.map(lambda y: metric(x, y)))
pd.DataFrame({k: v for k,v in sims.items()})

Is there a more elegant way to achieve this?

Upvotes: 0

Views: 59

Answers (1)

Arn

Reputation: 2015

You could use pd.concat to join the pd.Series objects together column-wise, which should be more efficient:

pd.concat([a.apply(metric, args=(a.loc[y],)) for y in a.index], axis=1, keys=a.index)
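
For comparison, here is a minimal self-contained sketch of a different way to get the same matrix: a nested list comprehension passed straight to the DataFrame constructor. This is not part of the approach above, just an alternative; it reuses `a` and `metric` from the question, and the name `result` is only for illustration.

```python
import numpy as np
import pandas as pd

a = pd.Series([[1, 2, 34], [2, 3], [2, 3, 4, 5, 1]], index=[1, 2, 3])

def metric(x, y):
    # Number of distinct values that appear in both x and y
    return len(np.intersect1d(x, y))

# One row per element x and one column per element y,
# both labelled with the original index of the Series
result = pd.DataFrame(
    [[metric(x, y) for y in a] for x in a],
    index=a.index,
    columns=a.index,
)
print(result)
```

Both versions still call metric once per pair, so the cost grows quadratically with the length of the Series. Since this metric is symmetric, the matrix is symmetric too.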

Upvotes: 1
