Ofer Sadan

Reputation: 11942

Count values in numpy array and return index by result

I have a 2d numpy array my_array that starts out like this:

array([[1., 2., 3., 4., 5., 6., 7., 8., 9.],
       [1., 2., 3., 4., 5., 6., 7., 8., 9.],
       [1., 2., 3., 4., 5., 6., 7., 8., 9.],
       [1., 2., 3., 4., 5., 6., 7., 8., 9.],
       [1., 2., 3., 4., 5., 6., 7., 8., 9.],
       [1., 2., 3., 4., 5., 6., 7., 8., 9.],
       [1., 2., 3., 4., 5., 6., 7., 8., 9.],
       [1., 2., 3., 4., 5., 6., 7., 8., 9.],
       [1., 2., 3., 4., 5., 6., 7., 8., 9.]])

After some processing (the details are irrelevant here), it looks like this:

array([[1., 2., 0., 4., 5., 6., 0., 8., 9.],
       [0., 2., 0., 0., 5., 6., 7., 8., 9.],
       [0., 2., 0., 4., 5., 0., 7., 0., 9.],
       [1., 2., 0., 4., 5., 6., 7., 8., 9.],
       [1., 2., 3., 4., 5., 0., 7., 8., 9.],
       [0., 2., 0., 4., 5., 6., 0., 8., 9.],
       [1., 2., 0., 4., 5., 6., 7., 8., 9.],
       [1., 2., 0., 4., 5., 6., 7., 8., 9.],
       [1., 2., 0., 4., 5., 6., 7., 8., 0.]])

As you can see, some of the items have been "zeroed out" quite randomly, but only the value 3 was left with exactly one non-zero item. I'm looking for a function that takes this array and returns the index / row number that contains the value 3 (or any other value that appears exactly once in the array).

To explain this differently:

I first have to figure out whether there is an item that appears exactly once (in this example the answer is yes, and that item is the number 3), and then I need to return its row number (in this case 4, since the only row containing 3 is my_array[4]).

I have done this successfully by iterating over the array item by item, counting how many times each number appears (and keeping only the item whose count is 1), and then iterating over everything a second time to find the index / row number where that item is located.

This seems very inefficient, especially for larger arrays. Is there a better way to do this in numpy?

EDIT: if the number that appears only once is 0, that shouldn't count. I'm only looking for the "column" that was zeroed out completely except for one item.

Upvotes: 0

Views: 789

Answers (2)

user3483203

Reputation: 51175

Edit: I wasn't actually using the mask, so the first and last lines are enough:

import numpy as np

x = np.array([[1., 2., 0., 4., 5., 6., 0., 8., 9.],
       [0., 2., 0., 0., 5., 6., 7., 8., 9.],
       [0., 2., 0., 4., 5., 0., 7., 0., 9.],
       [1., 2., 0., 4., 5., 6., 7., 8., 9.],
       [1., 2., 3., 4., 5., 0., 7., 8., 9.],
       [0., 2., 0., 4., 5., 6., 0., 8., 9.],
       [1., 2., 0., 4., 5., 6., 7., 8., 9.],
       [1., 2., 0., 4., 5., 6., 7., 8., 9.],
       [1., 2., 0., 4., 5., 6., 7., 8., 0.]])

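res = (x == 3)  # boolean mask: True wherever the value is 3
# np.where returns (row_indices, column_indices); [0] selects the rows
print(np.where(res * x)[0])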

Output:

[4]

The full return value of np.where() is:

(array([4], dtype=int64), array([2], dtype=int64))

So if you want both the row and the column index, you can use both of these arrays.
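The snippet above hardcodes the value 3. As a minimal sketch of the more general case from the question (finding any non-zero value that appears exactly once and returning its row), one option is np.unique with return_counts=True. The function name rows_of_unique_nonzero is made up for illustration and is not part of numpy:

```python
import numpy as np

def rows_of_unique_nonzero(arr):
    """Hypothetical helper: map each non-zero value that occurs
    exactly once in arr to the row index where it occurs."""
    values, counts = np.unique(arr, return_counts=True)
    # Keep values seen exactly once, ignoring 0 as the question requires
    singles = values[(counts == 1) & (values != 0)]
    result = {}
    for v in singles:
        rows, _cols = np.nonzero(arr == v)
        result[float(v)] = int(rows[0])
    return result

x = np.array([[1., 2., 0., 4., 5., 6., 0., 8., 9.],
              [0., 2., 0., 0., 5., 6., 7., 8., 9.],
              [0., 2., 0., 4., 5., 0., 7., 0., 9.],
              [1., 2., 0., 4., 5., 6., 7., 8., 9.],
              [1., 2., 3., 4., 5., 0., 7., 8., 9.],
              [0., 2., 0., 4., 5., 6., 0., 8., 9.],
              [1., 2., 0., 4., 5., 6., 7., 8., 9.],
              [1., 2., 0., 4., 5., 6., 7., 8., 9.],
              [1., 2., 0., 4., 5., 6., 7., 8., 0.]])

print(rows_of_unique_nonzero(x))  # {3.0: 4}
```

This still loops in Python, but only over the values that occur once, which is usually a very short list compared to the full array.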

Upvotes: 2

sshashank124

Reputation: 32197

Try using the numpy.count_nonzero function:

numpy.count_nonzero(arr, axis=0)

This counts the non-zero values in each column.

I will leave the rest to you. Good Luck
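For readers who want the remaining steps, here is one possible sketch. It assumes, as in the question, that each column originally held a single repeated value, so "a value that appears once" is the same as "a column with exactly one non-zero entry". The function name lone_survivor_positions is hypothetical:

```python
import numpy as np

def lone_survivor_positions(arr):
    """Hypothetical helper: return (row, column) pairs for every
    column that has exactly one non-zero entry."""
    nonzero_per_col = np.count_nonzero(arr, axis=0)
    cols = np.flatnonzero(nonzero_per_col == 1)
    # In each selected column, argmax on the boolean mask gives the
    # row of the first (and only) non-zero entry
    rows = np.argmax(arr[:, cols] != 0, axis=0)
    return list(zip(rows.tolist(), cols.tolist()))

x = np.array([[1., 2., 0., 4., 5., 6., 0., 8., 9.],
              [0., 2., 0., 0., 5., 6., 7., 8., 9.],
              [0., 2., 0., 4., 5., 0., 7., 0., 9.],
              [1., 2., 0., 4., 5., 6., 7., 8., 9.],
              [1., 2., 3., 4., 5., 0., 7., 8., 9.],
              [0., 2., 0., 4., 5., 6., 0., 8., 9.],
              [1., 2., 0., 4., 5., 6., 7., 8., 9.],
              [1., 2., 0., 4., 5., 6., 7., 8., 9.],
              [1., 2., 0., 4., 5., 6., 7., 8., 0.]])

print(lone_survivor_positions(x))  # [(4, 2)]
```

Unlike a value-based search, this never needs to know which value survived, and it avoids a Python-level loop over the array.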

Upvotes: 3
