Reputation: 11
I have an array like this and I have to find the distance between each points. How could I do so in python with numpy?
array([[ 8139, 112607],
[ 8139, 115665],
[ 8132, 126563],
[ 8193, 113938],
[ 8193, 123714],
[ 8156, 120291],
[ 8373, 125253],
[ 8400, 131442],
[ 8400, 136354],
[ 8401, 129352],
[ 8439, 129909],
[ 8430, 135706],
[ 8430, 146359],
[ 8429, 139089],
[ 8429, 133243]])
Upvotes: 0
Views: 1270
Reputation: 5939
Let's minimize this problem down to 4 points:
points = np.array([[8139, 115665], [8132, 126563], [8193, 113938], [8193, 123714]])
In general, you need to do 2 steps:
np.hypot
for these pairs.There are many ways of how you would like to create pairs of indices for each pair of points you'd like to take. But where do they come from? In every case it's a good idea to start building them from adjancency matrix.
Case 1
In the most common way you can start from building it like so:
adjacency = np.ones(shape=(len(points), len(points)), dtype=bool)
>>> adjacency
[[ True True True True]
[ True True True True]
[ True True True True]
[ True True True True]]
It corresponds to indices you need to take like so:
adjacency_idx_view = np.transpose(np.nonzero(adjacency))
for n in adjacency_idx_view.reshape(len(points), len(points), 2):
>>> print(n.tolist())
[[0, 0], [1, 0], [2, 0], [3, 0]]
[[0, 1], [1, 1], [2, 1], [3, 1]]
[[0, 2], [1, 2], [2, 2], [3, 2]]
[[0, 3], [1, 3], [2, 3], [3, 3]]
And this is how you collect them:
x, y = np.nonzero(adjacency)
>>> np.transpose([x, y])
array([[0, 0],
[0, 1],
[0, 2],
[0, 3],
[1, 0],
[1, 1],
[1, 2],
[1, 3],
[2, 0],
[2, 1],
[2, 2],
[2, 3],
[3, 0],
[3, 1],
[3, 2],
[3, 3]], dtype=int64)
It could be done also manually like in @ Corralien's answer:
x = np.repeat(np.arange(len(points)), len(points))
y = np.tile(np.arange(len(points)), len(points))
Case 2
In previous case every pair of point is duplicated. There are also pairs with points duplicating. A better option is to omit this excessive data and take only pairs with index of first point being less than index of the second one:
adjacency = np.less.outer(np.arange(len(points)), np.arange(len(points)))
>>> print(adjacency)
[[False True True True]
[False False True True]
[False False False True]
[False False False False]]
x, y = np.nonzero(adjacency)
This is not used widely. Although this lays beyond the hood of np.triu_indices
. Hence, as an alternative, we could use:
x, y = np.triu_indices(len(points), 1)
And this results in:
>>> np.transpose([x, y])
array([[0, 1],
[0, 2],
[0, 3],
[0, 4],
[1, 2],
[1, 3],
[1, 4],
[2, 3],
[2, 4],
[3, 4]])
Case 3 You could also try omit only pairs of duplicated points and leave pairs with points being swapped. As in Case 1 it costs 2x memory and consumption time so I'll leave it for demonstration purposes only:
adjacency = ~np.identity(len(points), dtype=bool)
>>> adjacency
array([[False, True, True, True],
[ True, False, True, True],
[ True, True, False, True],
[ True, True, True, False]])
x, y = np.nonzero(adjacency)
>>> np.transpose([x, y])
array([[0, 1],
[0, 2],
[0, 3],
[1, 0],
[1, 2],
[1, 3],
[2, 0],
[2, 1],
[2, 3],
[3, 0],
[3, 1],
[3, 2]], dtype=int64)
I'll leave making x
and y
manually (without masking) as an exercise for the others.
np.hypot
Instead of np.sqrt(np.sum((a - b) ** 2, axis=1))
you could do np.hypot(np.transpose(a - b))
. I'll take my Case 2 as my index generator:
def distance(points):
x, y = np.triu_indices(len(points), 1)
x_coord, y_coord = np.transpose(points[x] - points[y])
return np.hypot(x_coord, y_coord)
>>> distance(points)
array([10898.00224812, 1727.84403231, 8049.18113848, 12625.14736548,
2849.65296133, 9776. ])
Upvotes: 1
Reputation: 120391
You can use np.repeat
and np.tile
to create all combinations then compute the euclidean distance:
xy = np.array([[8139, 115665], [8132, 126563], [8193, 113938], [8193, 123714],
[8156, 120291], [8373, 125253], [8400, 131442], [8400, 136354],
[8401, 129352], [8439, 129909], [8430, 135706], [8430, 146359],
[8429, 139089], [8429, 133243]])
a = np.repeat(xy, len(xy), axis=0)
b = np.tile(xy, [len(xy), 1])
d = np.sqrt(np.sum((a - b) ** 2, axis=1))
The output of d
is (196,) which is 14 x 14.
Update
but I have to do it in a function.
def distance(xy):
a = np.repeat(xy, len(xy), axis=0)
b = np.tile(xy, [len(xy), 1])
return np.sqrt(np.sum((a - b) ** 2, axis=1))
d = distance(xy)
Upvotes: 0