Reputation: 8090
I have the following indices as you would get them from np.where(...)
:
coords = (
np.asarray([0 0 0 1 1 1 1 1 2 2 2 3 3 3 3 4 4 4 5 5 5 5 5 6 6 6]),
np.asarray([2 2 8 2 2 4 4 6 2 2 6 2 2 4 6 2 2 6 2 2 4 4 6 2 2 6]),
np.asarray([0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]),
np.asarray([0 1 0 0 1 0 1 1 0 1 1 0 1 1 1 0 1 1 0 1 0 1 1 0 1 1])
)
Another tuple with indices is meant to select those that are in coords
:
index = tuple(
np.asarray([0 0 1 1 1 1 2 2 2 3 3 3 3 4 4 4 5 5 5 5 5 6 6 6]),
np.asarray([2 8 2 4 4 6 2 2 6 2 2 4 6 2 2 6 2 2 4 4 6 2 2 6]),
np.asarray([0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]),
np.asarray([0 0 1 0 1 1 0 1 1 0 1 1 1 0 1 1 0 1 0 1 1 0 1 1])
)
So for instance, coords[0] is selected because it's in index (at position 0), but coords[1]
isn't selected because it's not available in index
.
I can calculate the mask easily with [x in zip(*index) for x in zip(*coords)]
(converted from bool to int for better readability):
[1 0 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1]
but this wouldn't be very efficient for larger arrays. Is there a more "numpy-based" way that could calculate the mask?
Upvotes: 1
Views: 545
Reputation: 53099
You can use np.ravel_multi_index
to compress the columns into unique numbers which are easier to handle:
cmx = *map(np.max, coords),
imx = *map(np.max, index),
shape = np.maximum(cmx, imx) + 1
ct = np.ravel_multi_index(coords, shape)
it = np.ravel_multi_index(index, shape)
it.sort()
result = ct == it[it.searchsorted(ct)]
print(result.view(np.int8))
Prints:
[1 0 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1]
Upvotes: 1
Reputation: 5294
Not so sure about efficiency but given you're basically comparing coordinates pairs you could use scipy
distance functions. Something along:
from scipy.spatial.distance import cdist
c = np.stack(coords).T
i = np.stack(index).T
d = cdist(c, i)
In [113]: np.any(d == 0, axis=1).astype(int)
Out[113]:
array([1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1])
By default it uses L2 norm, you could probably make it slightly faster with a simpler distance function, e.g.:
d = cdist(c,i, lambda u, v: np.all(np.equal(u,v)))
np.any(d != 0, axis=1).astype(int)
Upvotes: 1