Reputation: 1129
I have a numpy mask with shape [x_len,y_len,z_len]. I wish to find the z such that np.count_nonzero(mask[:,:,z]) is maximised.
My naive solution:
best_z = -1
best_score = -1
for z in range(mask.shape[2]):
    n_nonzero = np.count_nonzero(mask[:, :, z])
    if n_nonzero > best_score:
        best_score = n_nonzero
        best_z = z
But I'm looking for something faster and/or prettier.
Upvotes: 2
Views: 83
Reputation: 4148
I guess this is what you need:
best_z = np.argmax(np.count_nonzero(mask, axis=(0, 1)))
EDIT: my original version used axis=-1, which was an error; the count has to be taken over axes (0, 1). Thanks to mcsoini for noticing.
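As a quick sanity check, here is a minimal sketch that compares the loop from the question with the one-liner on a small random boolean mask. The function names, the array shape and the seed are arbitrary choices for illustration, not part of the original posts.

```python
import numpy as np

def best_z_loop(mask):
    """Loop from the question: scan every z-slice and keep the best one."""
    best_z, best_score = -1, -1
    for z in range(mask.shape[2]):
        n_nonzero = np.count_nonzero(mask[:, :, z])
        if n_nonzero > best_score:
            best_score, best_z = n_nonzero, z
    return best_z

def best_z_vectorized(mask):
    """Count non-zeros over axes 0 and 1, then take the argmax along z."""
    return int(np.argmax(np.count_nonzero(mask, axis=(0, 1))))

# Arbitrary test data (assumption: a boolean mask of shape 5 x 6 x 7).
rng = np.random.default_rng(0)
mask = rng.integers(0, 2, size=(5, 6, 7)).astype(bool)
print(best_z_loop(mask) == best_z_vectorized(mask))
```

Both versions return the first z with the maximum count when there is a tie, since the loop uses a strict `>` and `np.argmax` returns the first occurrence of the maximum.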
Upvotes: 3
Reputation: 1744
np.argmax(np.count_nonzero(foo, axis=(0, 1)))
yields the z-index of foo for which the number of non-zero elements is largest.
Here is a comparison of this solution with @mcsoini's solution and one more approach:
foo = np.random.randint(0, 2, size=(100, 100, 200))
# this solution
i> %timeit np.argmax(np.count_nonzero(foo, axis=(0, 1)))
o> 1.58 ms ± 43.9 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
# @mcsoini's solution
i> %timeit np.argmax(np.count_nonzero(foo.reshape(-1, foo.shape[-1]), axis=0))
o> 1.64 ms ± 18.1 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
# a trick solution
i> %timeit np.argmax(np.sum(foo, axis = (0, 1)))
o> 709 µs ± 4.87 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
The last solution takes about half the time of the other two. We can afford this trick because a mask is effectively an array of 0 and 1 values, so the sum of each slice equals its number of non-zero elements. It won't work if the array contains other values.
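To illustrate the caveat, here is a small sketch with a hypothetical non-binary array (the values are made up for the example). Summing the raw values picks a different slice than counting non-zeros; summing the result of `foo != 0` is one way to keep the sum-based approach correct for arbitrary values, though whether it is still faster is not something measured here.

```python
import numpy as np

# Hypothetical non-binary data: slice z=0 holds one large value,
# slice z=1 holds three small ones.
foo = np.zeros((2, 2, 2), dtype=int)
foo[0, 0, 0] = 10
foo[0, 0, 1] = 1
foo[0, 1, 1] = 1
foo[1, 0, 1] = 1

# Per-slice sums are [10, 3], so the raw sum picks z=0.
by_sum = int(np.argmax(np.sum(foo, axis=(0, 1))))
# Per-slice non-zero counts are [1, 3], so counting picks z=1.
by_count = int(np.argmax(np.count_nonzero(foo, axis=(0, 1))))
# Converting to a boolean mask first makes the sum agree with the count.
by_bool_sum = int(np.argmax(np.sum(foo != 0, axis=(0, 1))))

print(by_sum, by_count, by_bool_sum)
```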
Further comments:
All of these methods seem to take the same time (within the margin of error) if foo is of type bool (which a mask is expected to be). This may indicate that, under the hood, count_nonzero for boolean values is very similar to sum, but I don't know for sure; it would be nice to have some insight.
Upvotes: 4
Reputation: 1129
I came up with this:
unique, counts = np.unique(np.where(mask)[2], return_counts=True)
best_z = unique[np.argmax(counts)]
Although I expect dankal's and mcsoini's answers are both faster.
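A related variant, sketched here as an alternative and not as what the answer above does, swaps np.unique for np.bincount with minlength. As far as I can tell, the np.unique version raises a ValueError on an all-zero mask because np.argmax receives an empty array; with minlength every z gets a count, including zero. The function name is hypothetical.

```python
import numpy as np

def best_z_bincount(mask):
    """Count how often each z index appears among the non-zero entries."""
    z_indices = np.nonzero(mask)[2]  # z coordinate of every non-zero element
    # minlength guarantees one bin per z-slice, even for empty slices.
    counts = np.bincount(z_indices, minlength=mask.shape[2])
    return int(np.argmax(counts))

# Small hand-built example: slice z=2 has two non-zeros, z=0 has one.
mask = np.zeros((2, 2, 3), dtype=int)
mask[0, 0, 2] = 1
mask[1, 1, 2] = 1
mask[0, 1, 0] = 1
print(best_z_bincount(mask))
```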
Upvotes: 1
Reputation: 6642
You are looking for the index along the z-axis corresponding to the array's slice with the largest number of non-zero elements. With the example data
np.random.seed(3)
mask = np.random.randn(2, 3, 4)
mask = np.where(mask < 0, 0, mask)
print(mask)
[[[1.78862847 0.43650985 0.09649747 0. ]
[0. 0. 0. 0. ]
[0. 0. 0. 0.88462238]]
[[0.88131804 1.70957306 0.05003364 0. ]
[0. 0. 0.98236743 0. ]
[0. 0. 1.48614836 0.23671627]]]
we can first reshape the array with mask.reshape(-1, mask.shape[-1]) in order to collapse dimensions 0 and 1 into a single dimension. Then we count the non-zeros along this new first dimension with np.count_nonzero(..., axis=0), and finally we find the index along z where that count is largest (np.argmax):
np.argmax(np.count_nonzero(mask.reshape(-1, mask.shape[-1]), axis=0))
Result: 2
Upvotes: 2