Reputation: 949
Say I have the following points defined in a one dimensional space:
x = np.array([[0.70710678],
[0.70710678],
[0. ],
[1.41421356]])
I want to get m pair of points among these n points that have the longest euclidean distance between them (if m is 1 in this case will be 1.4142 and 0 )
I tried getting the pairwise distance with :
from scipy.spatial.distance import pdist, cdist
cdist(x,x, 'seuclidean')
from this part I'm not sure how to do the rest however.
Upvotes: 3
Views: 238
Reputation: 221534
We could make use of np.argpartition
on flattened distances off cdist
result -
dists = np.triu(cdist(x,x, 'seuclidean'),1)
s = dists.shape
idx = np.vstack(np.unravel_index(np.argpartition(dists.ravel(),-m)[-m:],s)).T
idx
would be m
pairs of indexes that are farthest, i.e. each row of idx
would represent indexes of one pair from x
.
Sample run -
# with m = 1
In [144]: idx
Out[144]: array([[2, 3]])
# with m = 2
In [147]: idx
Out[147]:
array([[1, 2],
[2, 3]])
# with m = 3
In [150]: idx
Out[150]:
array([[0, 3],
[1, 2],
[2, 3]])
Sample run on 2D
array -
In [44]: x
Out[44]:
array([[1.25, 1.25],
[1.25, 1.25],
[1.87, 1.87],
[0.62, 0.62],
[0.62, 0.62],
[1.25, 1.25],
[0. , 0. ],
[0.62, 0.62]])
In [45]: m = 2
In [46]: dists
Out[46]:
array([[0. , 0. , 1.58, 1.58, 1.58, 0. , 3.16, 1.58],
[0. , 0. , 1.58, 1.58, 1.58, 0. , 3.16, 1.58],
[0. , 0. , 0. , 3.16, 3.16, 1.58, 4.74, 3.16],
[0. , 0. , 0. , 0. , 0. , 1.58, 1.58, 0. ],
[0. , 0. , 0. , 0. , 0. , 1.58, 1.58, 0. ],
[0. , 0. , 0. , 0. , 0. , 0. , 3.16, 1.58],
[0. , 0. , 0. , 0. , 0. , 0. , 0. , 1.58],
[0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. ]])
In [47]: idx
Out[47]:
array([[0, 6],
[2, 6]])
Note that because of the way argpartition
works, idx
might not have the indices in their sorted order of distances. To force it that way, we could do -
idx[dists[tuple(idx.T)].argsort()]
Upvotes: 2
Reputation: 27869
To pair each point with it's furthest counterpart you can use:
np.dstack((x, x[cdist(x,x, 'seuclidean').argmax(axis=-1)]))
#array([[[0.70710678, 0. ]],
#
# [[0.70710678, 0. ]],
#
# [[0. , 1.41421356]],
#
# [[1.41421356, 0. ]]])
Upvotes: 1