Reputation: 17
I'm working on a big matrix (10-80k rows; 3k cols) and I want to compute a custom pairwise distance between its rows, and do it fast. I have tried Armadillo, but with data this large it is still slow. I then tried torch with CUDA acceleration and the built-in Euclidean distance, which is really fast (about 100 times faster). So now I want to implement a custom pairwise distance: for each pair of rows (a and b), take the standard deviation of ai*bi (where i indexes the cols). For example:
my_mat:
|1 |2 |3 |4
a |5 |3 |0 |4
b |1 |6 |2 |3
a//b dist = std(5*1,3*6,0*2,4*3)
= std(5,18,0,12)
= 7.889867
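As a quick sanity check of the arithmetic above, here is a minimal sketch using only the Python standard library. The variable names are illustrative, and it assumes the sample standard deviation (dividing by n - 1), which is the convention that reproduces the value 7.889867.

```python
import statistics

# Rows a and b from the example matrix above.
a = [5, 3, 0, 4]
b = [1, 6, 2, 3]

# Column-wise products a_i * b_i.
products = [x * y for x, y in zip(a, b)]

# Sample standard deviation (divides by n - 1).
dist_ab = statistics.stdev(products)

print(products)           # [5, 18, 0, 12]
print(round(dist_ab, 6))  # 7.889867
```

Using the population standard deviation (statistics.pstdev) would give a different number, so whichever implementation is used on the GPU should follow the same convention.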
My idea: start with my two-dimensional (N,M) tensor (my_mat), create a new three-dimensional tensor (N,N,P) with P = M, and in the P dimension store the list of column-wise products for each pair of rows:
3_dim_tens :
|a |b
a |Pdim(5*5,3*3,0*0,4*4) |Pdim(5*1,3*6,0*2,4*3)
b |Pdim(5*1,3*6,0*2,4*3) |Pdim(1*1,6*6,2*2,3*3)
Then, if I reduce the P dimension with std(), I get a two-dimensional (N,N) pairwise matrix with my custom distance. (It is essentially matmul my_mat * t(my_mat), but with std in place of the sum.)
Is it possible to do this with torch, or is there another way to compute a custom pairwise distance?
Upvotes: 0
Views: 356
Reputation: 11628
I think the most intuitive way is to use einsum for this:
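import torch
# Example matrix from the question: rows a and b, four columns.
a = torch.tensor([[5.0, 3, 0, 4],[1, 6, 2, 3]])
# 'ij,kj->ikj' builds the (N,N,M) tensor of column-wise products for every row pair;
# std(dim=2) then reduces the last dimension with the sample standard deviation.
b = torch.einsum('ij,kj->ikj', a, a).std(dim=2)
# The off-diagonal entries are std(5, 18, 0, 12), about 7.8899.
print(b)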
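One caveat for the matrix sizes in the question: the intermediate (N,N,M) tensor grows quickly. For 10,000 rows and 3,000 cols it would already need about 1.2 TB in float32. A possible workaround, sketched below, relies on the identity var = (sum(p^2) - sum(p)^2 / M) / (M - 1) for the products p_i = a_i*b_i, so that both sums become ordinary matrix products and only (N,N) tensors are created. The function name pairwise_product_std is hypothetical, and this is a sketch rather than a tuned implementation.

```python
import torch

def pairwise_product_std(x: torch.Tensor) -> torch.Tensor:
    """Sample std of the column-wise products for every pair of rows of x.

    Hypothetical helper: it avoids the (N, N, M) intermediate by using
    var = (sum(p^2) - sum(p)^2 / M) / (M - 1), where p_i = a_i * b_i.
    """
    n_cols = x.shape[1]
    sum_p = x @ x.T                 # sum_i a_i * b_i for every row pair
    x_sq = x * x
    sum_p_sq = x_sq @ x_sq.T        # sum_i (a_i * b_i)^2 for every row pair
    var = (sum_p_sq - sum_p * sum_p / n_cols) / (n_cols - 1)
    # Rounding can push tiny variances slightly below zero; clamp before sqrt.
    return var.clamp(min=0).sqrt()

# Same example as above, in float64 to limit cancellation error.
a = torch.tensor([[5.0, 3, 0, 4], [1, 6, 2, 3]], dtype=torch.float64)
d = pairwise_product_std(a)
print(d)
```

The subtraction of two large sums can lose precision, especially in float32, so it is worth comparing against the einsum version on a small sample first. The (N,N) result is itself large (roughly 25.6 GB in float32 for 80,000 rows), so computing it in blocks of rows may still be necessary.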
Upvotes: 1