How to compute a similarity metric between two Series containing lists?

Question

Given the following Series:

a = pd.Series([[1, 2, 34], [2, 3], [2, 3, 4, 5, 1]], index=[1, 2, 3])

1         [1, 2, 34]
2             [2, 3]
3    [2, 3, 4, 5, 1]

and the following metric:

def metric(x, y):
    return len(np.intersect1d(x, y))
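For a concrete sense of what this metric returns, here is a small self-contained check on lists taken from the Series above. np.intersect1d returns the sorted unique values common to both inputs, so a value repeated within one list is counted only once. The variable names are just for illustration.

```python
import numpy as np

def metric(x, y):
    # Number of distinct values that appear in both x and y
    return len(np.intersect1d(x, y))

# Rows 1 and 3 of the Series share the values 1 and 2
shared_first_last = metric([1, 2, 34], [2, 3, 4, 5, 1])

# A list compared with itself shares all of its distinct values
shared_with_self = metric([2, 3], [2, 3])

# Duplicates inside one list do not inflate the count
shared_with_dupes = metric([2, 2, 3], [2])

print(shared_first_last, shared_with_self, shared_with_dupes)
```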

I want to compute the pairwise similarity over the Series, and the result should be:

   1  2  3
1  3  1  2
2  1  2  2
3  2  2  5

So far I have used this:

sims = a.map(lambda x: a.map(lambda y: metric(x, y)))
pd.DataFrame({k: v for k,v in sims.items()})

Is there a more elegant way to achieve this?

Arn · Accepted Answer

You could use pd.concat to join the pd.Series objects together column-wise; it's more efficient.

pd.concat([a.apply(metric, args=(a.loc[y],)) for y in a.index], axis=1)
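Put together with the data from the question, the approach can be sketched as a runnable script. Two details here are additions rather than part of the answer as posted: the axis is passed by keyword, since recent pandas versions no longer accept it positionally in pd.concat, and keys=a.index is used so the columns carry the Series labels instead of 0, 1, 2. The names columns and sim are illustrative.

```python
import numpy as np
import pandas as pd

a = pd.Series([[1, 2, 34], [2, 3], [2, 3, 4, 5, 1]], index=[1, 2, 3])

def metric(x, y):
    # Number of distinct values that appear in both x and y
    return len(np.intersect1d(x, y))

# One Series per label y: the metric of every row's list against a.loc[y]
columns = [a.apply(metric, args=(a.loc[y],)) for y in a.index]

# axis=1 places the Series side by side; keys labels the resulting columns
sim = pd.concat(columns, axis=1, keys=a.index)
print(sim)
```

The result is symmetric because the metric is, and its diagonal holds the number of distinct values in each list (3, 2 and 5 here).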
