Reputation: 737
I'm trying to get the index values out of a numpy array. I've tried using intersects instead, to no avail. I'm simply trying to find matching values in two arrays. One is 2D and I'm selecting a column from it; the other is 1D, just a list of values to search for, so effectively I have two 1D arrays.
We'll call this array a:
array([[    1, 97553,     1],
       [    1, 97587,     1],
       [    1, 97612,     1],
       [    1, 97697,     1],
       [    1, 97826,     3],
       [    1, 97832,     1],
       [    1, 97839,     1],
       [    1, 97887,     1],
       [    1, 97944,     1],
       [    1, 97955,     2]])
And we're searching for, say, values = numpy.array([97612, 97633, 97697, 97999, 97943, 97944])
So I try:
numpy.where(a[:, 1] == values)
And I'd expect a bunch of indices of the matching values, but instead I get back an empty array: [(array([], dtype=int64),)].
If I try this though:
numpy.where(a[:, 1] == 97697)
It gives me back (array([3]),), which is what I would expect.
What weirdness of arrays am I missing here? Or is there maybe an easier way to do this? Finding array indices and matching arrays doesn't seem to work as I expect at all. When I want to find the unions or intersections of arrays, by index or by unique value, it just doesn't seem to function. Any help would be super. Thanks.
Edit: As per Warren's request:
import numpy
a = numpy.array([[1, 97553, 1],
                 [1, 97587, 1],
                 [1, 97612, 1],
                 [1, 97697, 1],
                 [1, 97826, 3],
                 [1, 97832, 1],
                 [1, 97839, 1],
                 [1, 97887, 1],
                 [1, 97944, 1],
                 [1, 97955, 2]])
values = numpy.array([97612, 97633, 97697, 97999, 97943, 97944])
I've found that numpy.in1d will give me a correct boolean mask for the operation: a 1D array of the same length as the column, which should map back to the original data. My only issue now is how to act on it, for instance deleting or modifying the rows of the original array at those indices. I could do it laboriously with a loop, but as far as I know there are better ways in numpy. Boolean masks are supposed to be quite powerful in numpy, from what I have been able to find.
Upvotes: 1
Views: 22655
Reputation: 5294
np.where with a single argument is equivalent to np.nonzero. It gives you the indices where the condition (the input array) is True.
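In your example you are checking for element-wise equality between a[:,1] and values. These have different lengths (10 and 6), so they cannot be compared element by element, and the comparison collapses to a single False:
a[:, 1] == values
False
So np.where is giving you the correct result: no index in the input is True. (Newer NumPy releases, 1.25 and later as far as I know, raise a ValueError for this comparison instead of returning False.)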
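To make the shape mismatch concrete, here is a minimal sketch. It assumes NumPy 1.20 or later for np.broadcast_shapes, and the names col and shapes_compatible are only illustrative (col stands in for a[:, 1]).

```python
import numpy as np

# col stands in for a[:, 1] from the question
col = np.array([97553, 97587, 97612, 97697, 97826,
                97832, 97839, 97887, 97944, 97955])
values = np.array([97612, 97633, 97697, 97999, 97943, 97944])

print(col.shape, values.shape)  # (10,) (6,)

# Element-wise == needs shapes that broadcast together; (10,) and (6,) do not.
try:
    np.broadcast_shapes(col.shape, values.shape)
    shapes_compatible = True
except ValueError:
    shapes_compatible = False

print(shapes_compatible)  # False
```

The sketch deliberately checks the shapes rather than running the comparison itself, because what the comparison does with incompatible shapes depends on the NumPy version.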
You should use np.isin instead:
np.isin(a[:,1], values)
array([False, False, True, True, False, False, False, False, True, False], dtype=bool)
Now you can use np.where to get the indices
np.where(np.isin(a[:,1], values))
(array([2, 3, 8]),)
and use those to address the original array
a[np.where(np.isin(a[:,1], values))]
array([[ 1, 97612, 1],
[ 1, 97697, 1],
[ 1, 97944, 1]])
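Put together as a self-contained script, the steps above might look like the sketch below. Using np.flatnonzero instead of np.where is my own substitution: it returns a plain 1D index array rather than a one-element tuple, which some people find easier to pass around.

```python
import numpy as np

a = np.array([[1, 97553, 1],
              [1, 97587, 1],
              [1, 97612, 1],
              [1, 97697, 1],
              [1, 97826, 3],
              [1, 97832, 1],
              [1, 97839, 1],
              [1, 97887, 1],
              [1, 97944, 1],
              [1, 97955, 2]])
values = np.array([97612, 97633, 97697, 97999, 97943, 97944])

# Boolean mask: True where the second column holds one of the searched values
mask = np.isin(a[:, 1], values)

# Integer positions of the True entries, as a plain 1D array
idx = np.flatnonzero(mask)
print(idx)  # [2 3 8]

# Either the mask or the indices select the same rows
rows = a[mask]
print(rows[:, 1])  # [97612 97697 97944]
```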
Your initial solution with a simple equality check could indeed have worked with proper broadcasting:
np.where(a[:,1] == values[..., np.newaxis])[1]
array([2, 3, 8])
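For anyone curious what the broadcast comparison builds, here is a small sketch. It puts the new axis on the column rather than on values (my choice, so that each row of the intermediate array corresponds to a row of a), and the names eq and mask are illustrative.

```python
import numpy as np

a = np.array([[1, 97553, 1],
              [1, 97587, 1],
              [1, 97612, 1],
              [1, 97697, 1],
              [1, 97826, 3],
              [1, 97832, 1],
              [1, 97839, 1],
              [1, 97887, 1],
              [1, 97944, 1],
              [1, 97955, 2]])
values = np.array([97612, 97633, 97697, 97999, 97943, 97944])

# (10, 1) compared with (6,) broadcasts to a (10, 6) table:
# eq[i, j] is True when a[i, 1] == values[j]
eq = a[:, 1][:, np.newaxis] == values
print(eq.shape)  # (10, 6)

# A row of a matches if it equals any of the searched values
mask = eq.any(axis=1)
print(np.flatnonzero(mask))  # [2 3 8]
```

Note that the intermediate table has len(a) * len(values) entries, so for large inputs np.isin is likely the more economical choice.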
EDIT: since you seem to have trouble using the above results to index and manipulate your array, here are a couple of simple examples.
You now have two ways of accessing the matching elements in the original array: either the boolean mask or the indices from np.where.
mask = np.isin(a[:,1], values) # np.in1d if np.isin is not available
idx = np.where(mask)
Let's say you want to set all matching rows to zero
a[mask] = 0 # or a[idx] = 0
array([[ 1, 97553, 1],
[ 1, 97587, 1],
[ 0, 0, 0],
[ 0, 0, 0],
[ 1, 97826, 3],
[ 1, 97832, 1],
[ 1, 97839, 1],
[ 1, 97887, 1],
[ 0, 0, 0],
[ 1, 97955, 2]])
Or you want to multiply the third column of matching rows by 100
a[mask, 2] *= 100
array([[ 1, 97553, 1],
[ 1, 97587, 1],
[ 1, 97612, 100],
[ 1, 97697, 100],
[ 1, 97826, 3],
[ 1, 97832, 1],
[ 1, 97839, 1],
[ 1, 97887, 1],
[ 1, 97944, 100],
[ 1, 97955, 2]])
Or you want to delete matching rows (here using indices is more convenient than masks)
np.delete(a, idx, axis=0)
array([[ 1, 97553, 1],
[ 1, 97587, 1],
[ 1, 97826, 3],
[ 1, 97832, 1],
[ 1, 97839, 1],
[ 1, 97887, 1],
[ 1, 97955, 2]])
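Deleting also works with the mask alone by inverting it. The sketch below compares that with np.delete; both return a new array and leave a untouched, so the result has to be assigned to a name. The names kept and deleted are illustrative.

```python
import numpy as np

a = np.array([[1, 97553, 1],
              [1, 97587, 1],
              [1, 97612, 1],
              [1, 97697, 1],
              [1, 97826, 3],
              [1, 97832, 1],
              [1, 97839, 1],
              [1, 97887, 1],
              [1, 97944, 1],
              [1, 97955, 2]])
values = np.array([97612, 97633, 97697, 97999, 97943, 97944])

mask = np.isin(a[:, 1], values)

# Keep only the rows that do NOT match
kept = a[~mask]

# Same result via np.delete with a plain 1D index array
deleted = np.delete(a, np.flatnonzero(mask), axis=0)

print(kept.shape)  # (7, 3)
print(a.shape)     # (10, 3), the original is unchanged
```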
Upvotes: 11
Reputation: 86
Just a thought:
Try to flatten the 2D array and compare using numpy.intersect1d.
https://docs.scipy.org/doc/numpy-1.14.0/reference/generated/numpy.ndarray.flatten.html
https://docs.scipy.org/doc/numpy-1.14.0/reference/generated/numpy.intersect1d.html
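A rough sketch of how that could look, assuming NumPy 1.15 or later, where intersect1d accepts return_indices=True. Since only one column is being searched, the sketch passes a[:, 1] directly instead of flattening the whole array; the names common, idx_a, and idx_values are illustrative.

```python
import numpy as np

a = np.array([[1, 97553, 1],
              [1, 97587, 1],
              [1, 97612, 1],
              [1, 97697, 1],
              [1, 97826, 3],
              [1, 97832, 1],
              [1, 97839, 1],
              [1, 97887, 1],
              [1, 97944, 1],
              [1, 97955, 2]])
values = np.array([97612, 97633, 97697, 97999, 97943, 97944])

# common: sorted shared values; idx_a / idx_values: where each one
# first occurs in a[:, 1] and in values respectively
common, idx_a, idx_values = np.intersect1d(a[:, 1], values,
                                           return_indices=True)

print(common)      # [97612 97697 97944]
print(idx_a)       # [2 3 8]
print(idx_values)  # [0 2 5]
```

One thing to keep in mind: the returned indices point at the first occurrence of each common value, so if the column contains duplicates, np.isin is probably the better fit.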
Upvotes: 1