Reputation: 11942
I have a 2D NumPy array, my_array, that starts out like this:
array([[1., 2., 3., 4., 5., 6., 7., 8., 9.],
[1., 2., 3., 4., 5., 6., 7., 8., 9.],
[1., 2., 3., 4., 5., 6., 7., 8., 9.],
[1., 2., 3., 4., 5., 6., 7., 8., 9.],
[1., 2., 3., 4., 5., 6., 7., 8., 9.],
[1., 2., 3., 4., 5., 6., 7., 8., 9.],
[1., 2., 3., 4., 5., 6., 7., 8., 9.],
[1., 2., 3., 4., 5., 6., 7., 8., 9.],
[1., 2., 3., 4., 5., 6., 7., 8., 9.]])
After some processing (the details are irrelevant here), it looks like this:
array([[1., 2., 0., 4., 5., 6., 0., 8., 9.],
[0., 2., 0., 0., 5., 6., 7., 8., 9.],
[0., 2., 0., 4., 5., 0., 7., 0., 9.],
[1., 2., 0., 4., 5., 6., 7., 8., 9.],
[1., 2., 3., 4., 5., 0., 7., 8., 9.],
[0., 2., 0., 4., 5., 6., 0., 8., 9.],
[1., 2., 0., 4., 5., 6., 7., 8., 9.],
[1., 2., 0., 4., 5., 6., 7., 8., 9.],
[1., 2., 0., 4., 5., 6., 7., 8., 0.]])
As you can see, some of the items have been "zeroed out" more or less at random, but 3 is the only value left with exactly one non-zero occurrence. I'm looking for a function that takes this array and returns the index (row number) of the row that contains the value 3 (or any other value that appears exactly once in the array).
To explain this differently:
I first have to figure out whether there is an item that appears only once (in this example the answer is yes, and that item is the number 3), and then I need to return its row number (in this case 4, since the only row with 3 in it is my_array[4]).
I have successfully done that by iterating over the array item by item, counting the number of times each number appears (and keeping only the item whose count is 1), and then iterating over everything a second time to find the index (row number) where that item is located.
This seems very inefficient, especially for larger arrays. Is there a better way to do this in NumPy?
EDIT: If the number that appears only once is 0, that shouldn't count. I'm only looking for the "column" that was zeroed out completely except for one item.
Upvotes: 0
Views: 789
Reputation: 51175
Edit: I wasn't even using the mask; you can just use the first and last lines:
import numpy as np

x = np.array([[1., 2., 0., 4., 5., 6., 0., 8., 9.],
[0., 2., 0., 0., 5., 6., 7., 8., 9.],
[0., 2., 0., 4., 5., 0., 7., 0., 9.],
[1., 2., 0., 4., 5., 6., 7., 8., 9.],
[1., 2., 3., 4., 5., 0., 7., 8., 9.],
[0., 2., 0., 4., 5., 6., 0., 8., 9.],
[1., 2., 0., 4., 5., 6., 7., 8., 9.],
[1., 2., 0., 4., 5., 6., 7., 8., 9.],
[1., 2., 0., 4., 5., 6., 7., 8., 0.]])
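# Boolean array that is True wherever the value 3 survives
res = (x == 3)
# np.where returns (row_indices, column_indices); [0] selects the rows
print(np.where(res)[0])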
Output:
[4]
The full return value of np.where() is:
(array([4], dtype=int64), array([2], dtype=int64))
The first array holds the row indices and the second holds the column indices, so if you want both the row and the column number, you can use both of these.
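The snippet above hard-codes the value 3, while the question asks to discover which value appears exactly once. One possible way to do that first step is np.unique with return_counts=True. This is only a sketch: the helper name find_single_values is hypothetical and not part of NumPy, and it assumes that exact equality on the float values is good enough and that 0 should be ignored, as the question's edit asks.

```python
import numpy as np

x = np.array([[1., 2., 0., 4., 5., 6., 0., 8., 9.],
              [0., 2., 0., 0., 5., 6., 7., 8., 9.],
              [0., 2., 0., 4., 5., 0., 7., 0., 9.],
              [1., 2., 0., 4., 5., 6., 7., 8., 9.],
              [1., 2., 3., 4., 5., 0., 7., 8., 9.],
              [0., 2., 0., 4., 5., 6., 0., 8., 9.],
              [1., 2., 0., 4., 5., 6., 7., 8., 9.],
              [1., 2., 0., 4., 5., 6., 7., 8., 9.],
              [1., 2., 0., 4., 5., 6., 7., 8., 0.]])


def find_single_values(arr):
    """Hypothetical helper: return (value, row, column) for every
    non-zero value that occurs exactly once in arr."""
    # Distinct values in the whole array and how often each one occurs
    values, counts = np.unique(arr, return_counts=True)
    # Keep values seen exactly once, skipping 0 (assumption from the question's edit)
    singles = values[(counts == 1) & (values != 0)]
    found = []
    for value in singles:
        # Each value in singles occurs once, so there is exactly one match
        rows, cols = np.where(arr == value)
        found.append((float(value), int(rows[0]), int(cols[0])))
    return found


print(find_single_values(x))
```

For the example array this should print [(3.0, 4, 2)], i.e. the value 3 sits in row 4, column 2. If no value qualifies, the returned list is empty.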
Upvotes: 2
Reputation: 32197
Try using the numpy.count_nonzero function:
numpy.count_nonzero(arr, axis=0)
This will count the non-zero values in each column.
Here's a Demo
I will leave the rest to you. Good luck.
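For readers who want to see where the column counts could lead, here is one possible way to finish the idea. It is a sketch rather than this answer's intended solution: the helper name single_survivor_rows is hypothetical, and it assumes, as the question's edit describes, that the target is a column with exactly one non-zero entry left.

```python
import numpy as np

x = np.array([[1., 2., 0., 4., 5., 6., 0., 8., 9.],
              [0., 2., 0., 0., 5., 6., 7., 8., 9.],
              [0., 2., 0., 4., 5., 0., 7., 0., 9.],
              [1., 2., 0., 4., 5., 6., 7., 8., 9.],
              [1., 2., 3., 4., 5., 0., 7., 8., 9.],
              [0., 2., 0., 4., 5., 6., 0., 8., 9.],
              [1., 2., 0., 4., 5., 6., 7., 8., 9.],
              [1., 2., 0., 4., 5., 6., 7., 8., 9.],
              [1., 2., 0., 4., 5., 6., 7., 8., 0.]])


def single_survivor_rows(arr):
    """Hypothetical helper: for every column with exactly one non-zero
    entry, return the row index of that entry and the column index."""
    # Number of non-zero entries in each column
    nonzero_per_column = np.count_nonzero(arr, axis=0)
    # Columns in which exactly one entry survived
    cols = np.flatnonzero(nonzero_per_column == 1)
    # argmax over a boolean column gives the position of its first True,
    # which is the only True in these columns
    rows = np.argmax(arr[:, cols] != 0, axis=0)
    return rows, cols


rows, cols = single_survivor_rows(x)
print(rows, cols)
```

For the example array this should print [4] [2]: column 2 has a single surviving entry, and it is in row 4. Unlike counting values across the whole array, this variant looks at each column separately, which matches the "zeroed-out column" wording of the question.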
Upvotes: 3