Python - Efficient way to find the largest area of a specific value in a 2D numpy array

Question

I have a 2D numpy array where some values are zero and some are not. I'm looking for an efficient way to find the biggest clump of zeros in the array, returning the number of zeros as well as a rough idea of where the center is.

For example in this array, I would like to find the clump of 9, with the center of (3,4):

[[ 1, 1, 1, 0, 0 ],
 [ 1, 0, 1, 1, 0 ],
 [ 1, 1, 1, 1, 1 ],
 [ 1, 1, 0, 0, 0 ],
 [ 1, 1, 0, 0, 0 ],
 [ 1, 1, 0, 0, 0 ]]

Is there a nice vectorized way to accomplish something like this in numpy or scipy?

The clumps will be roughly circular in shape, and have no holes in them.

ndimage.label() from scipy does something close to this, but isn't quite what I'm after. I have a feeling numpy.where() and numpy.diff() could be helpful, but I'm not sure how to use them efficiently to solve this problem.

Bi Rico · Accepted Answer

You're almost there. You just need to combine ndimage.label with numpy.bincount:

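import numpy as np
from scipy import ndimage

# Random test data with values 0, 1 and 2
array = np.random.randint(0, 3, size=(200, 200))

# Label each connected region of zeros; label 0 marks the non-zero background
label, num_label = ndimage.label(array == 0)
# size[i] is the number of cells carrying label i
size = np.bincount(label.ravel())
# Skip the background count at index 0, then shift the index back by 1
biggest_label = size[1:].argmax() + 1
clump_mask = label == biggest_label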

Once you have clump_mask you can compute the centroid or use some other method to get the center.
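
As a rough sketch of that last step, the snippet below runs the same approach on the example array from the question and takes the mean of the clump's cell indices as the center. The helper name largest_clump, its value parameter, and the early return for arrays containing no matching cells are assumptions added for illustration, not part of the original answer. It relies on the default connectivity of ndimage.label, which in 2D joins only cells that share an edge.

```python
import numpy as np
from scipy import ndimage


def largest_clump(array, value=0):
    """Return (cell count, (row, col) centroid) of the biggest clump of `value`.

    Hypothetical helper: the name and the empty-case handling are assumptions.
    """
    label, num_label = ndimage.label(array == value)
    if num_label == 0:
        # No matching cells at all, so there is no clump to report.
        return 0, None
    size = np.bincount(label.ravel())
    # Index 0 counts the background, so skip it and shift back by 1.
    biggest_label = size[1:].argmax() + 1
    clump_mask = label == biggest_label
    # Centroid: mean of the (row, col) indices of the cells in the clump.
    center = np.argwhere(clump_mask).mean(axis=0)
    return int(size[biggest_label]), tuple(center.tolist())


# The example array from the question.
example = np.array([
    [1, 1, 1, 0, 0],
    [1, 0, 1, 1, 0],
    [1, 1, 1, 1, 1],
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0],
])

count, center = largest_clump(example)
print(count, center)
```

The centroid comes back in (row, column) order with 0-based indices, so the 3x3 clump in the example is reported at row 4, column 3. That matches the (3,4) in the question if the latter is read as (x, y). scipy's ndimage.center_of_mass is another way to get the center from clump_mask.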
