Reputation: 729
I have a 2D numpy array in which some values are zero and some are not. I'm looking for an efficient way to find the biggest clump of zeros in the array, returning the number of zeros as well as a rough idea of where the center is.
For example, in this array I would like to find the clump of 9, with its center at (3,4):
[[ 1, 1, 1, 0, 0 ],
[ 1, 0, 1, 1, 0 ],
[ 1, 1, 1, 1, 1 ],
[ 1, 1, 0, 0, 0 ],
[ 1, 1, 0, 0, 0 ],
[ 1, 1, 0, 0, 0 ]]
Is there a nice vectorized way to accomplish something like this in numpy or scipy?
The clumps will be roughly circular in shape, and have no holes in them.
ndimage.label() from scipy does something close to this, but isn't quite what I'm after. I have a feeling numpy.where() and numpy.diff() could be helpful, but I'm not sure how to use them efficiently to solve this problem.
Upvotes: 12
Views: 3152
Reputation: 25813
You're almost there, you just need to combine ndimage.label with numpy.bincount:
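import numpy as np
from scipy import ndimage

array = np.random.randint(0, 3, size=(200, 200))

# Label each connected group of zeros; 0 in `label` marks the non-zero background
label, num_label = ndimage.label(array == 0)
# size[i] is the number of elements carrying label i
size = np.bincount(label.ravel())
# Skip index 0 (the background) when looking for the largest clump
biggest_label = size[1:].argmax() + 1
clump_mask = label == biggest_label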
Once you have clump_mask, you can compute the centroid or use some other method to get the center.
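As a rough sketch of that last step, here is the same approach applied to the small array from the question, with the center taken as the mean of the clump's indices. The names clump_size, center_row and center_col are just illustrative, and the mean-of-indices centroid is one simple choice among several, not the only way to define the center. The code also assumes the array contains at least one zero; otherwise size[1:] is empty and argmax raises an error.

```python
import numpy as np
from scipy import ndimage

# Example array from the question
array = np.array([[1, 1, 1, 0, 0],
                  [1, 0, 1, 1, 0],
                  [1, 1, 1, 1, 1],
                  [1, 1, 0, 0, 0],
                  [1, 1, 0, 0, 0],
                  [1, 1, 0, 0, 0]])

# Label connected groups of zeros (default connectivity: no diagonals)
label, num_label = ndimage.label(array == 0)
size = np.bincount(label.ravel())

# Assumes at least one zero exists, so size[1:] is non-empty
biggest_label = size[1:].argmax() + 1
clump_mask = label == biggest_label

# Number of zeros in the biggest clump
clump_size = int(size[biggest_label])

# Centroid as the mean (row, column) index of the clump's elements
center_row, center_col = np.argwhere(clump_mask).mean(axis=0)

print(clump_size, center_row, center_col)
```

For this array the clump has 9 elements and the centroid comes out as row 4, column 3 (zero-based), which matches the (3,4) in the question if that is read as (x, y).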
Upvotes: 14