Brent

Reputation: 729

Python - Efficient way to find the largest area of a specific value in a 2D numpy array

I have a 2D numpy array in which some values are zero and some are not. I'm looking for an efficient way to find the biggest clump of zeros in the array, returning the number of zeros in it as well as a rough idea of where its center is.

For example in this array, I would like to find the clump of 9, with the center of (3,4):

[[ 1, 1, 1, 0, 0 ],
 [ 1, 0, 1, 1, 0 ],
 [ 1, 1, 1, 1, 1 ],
 [ 1, 1, 0, 0, 0 ],
 [ 1, 1, 0, 0, 0 ],
 [ 1, 1, 0, 0, 0 ]]

Is there a nice vectorized way to accomplish something like this in numpy or scipy?

The clumps will be roughly circular in shape, and have no holes in them.

ndimage.label() from scipy does something close to this, but isn't quite what I'm after. I have a feeling numpy.where() and numpy.diff() could be helpful, but I'm not sure how to use them efficiently to solve this problem.

Upvotes: 12

Views: 3152

Answers (1)

Bi Rico

Reputation: 25813

You're almost there; you just need to combine ndimage.label with numpy.bincount:

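import numpy as np
from scipy import ndimage

# Random test array with values 0, 1 and 2
array = np.random.randint(0, 3, size=(200, 200))

# Label connected regions of zeros; label 0 marks the background (non-zero cells)
label, num_label = ndimage.label(array == 0)
# size[i] is the number of cells carrying label i
size = np.bincount(label.ravel())
# Skip label 0 (the background) when picking the largest region
biggest_label = size[1:].argmax() + 1
# Boolean mask that is True only inside the largest clump
clump_mask = label == biggest_label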
Once you have clump_mask, you can compute the centroid or use some other method to get the center. The number of zeros in the clump is size[biggest_label].
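As a rough sketch of that last step, here is the same approach applied to the small array from the question. Taking the centroid as the mean of the mask's coordinates is just one possible choice of "center", and the names clump_size and center are made up for this illustration. It also assumes the default connectivity of ndimage.label, where diagonal neighbours do not count as connected.

```python
import numpy as np
from scipy import ndimage

# Example array from the question
array = np.array([[1, 1, 1, 0, 0],
                  [1, 0, 1, 1, 0],
                  [1, 1, 1, 1, 1],
                  [1, 1, 0, 0, 0],
                  [1, 1, 0, 0, 0],
                  [1, 1, 0, 0, 0]])

# Same steps as above: label the zero regions and pick the largest one
label, num_label = ndimage.label(array == 0)
size = np.bincount(label.ravel())
biggest_label = size[1:].argmax() + 1
clump_mask = label == biggest_label

# Number of zeros in the largest clump (illustrative name)
clump_size = int(size[biggest_label])

# Centroid as the mean (row, column) index of the cells in the mask
# (one possible definition of the center, not the only one)
center = np.argwhere(clump_mask).mean(axis=0)

print(clump_size)  # 9
print(center)      # [4. 3.]
```

The center comes out in (row, column) order with zero-based indices, so it would match the (3,4) from the question if that is read as (column, row).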

Upvotes: 14
