jlcv

Reputation: 1808

Quickly find the min and max coordinates of connected components in a large image

I have an image of size 50000x50000. It has around 25000 different connected components. I'm using ndimage.label to label each of them, then I find the nonzero points and finally get the min x, max x, min y and max y values. However, I have to find these coordinates for each of the 25000 connected components. This is expensive, as I have to run np.nonzero on the 50000x50000 image 25000 times. Here is a snippet of the code doing what I just described.

import numpy as np
from scipy import ndimage

# label() returns the labelled image and the number of components found
im, num_instances = ndimage.label(im)
for instance_id in range(1, num_instances + 1):
    im_inst = im == instance_id
    points = np.nonzero(im_inst)  # running this is expensive as im is 50000x50000

    # points[0] holds the axis-0 indices, points[1] the axis-1 indices
    cropped_min_x_1 = np.min(points[0])
    cropped_min_y_1 = np.min(points[1])
    cropped_max_x_1 = np.max(points[0]) + 1
    cropped_max_y_1 = np.max(points[1]) + 1

Does anyone know what I can do to significantly speed up this process?

Upvotes: 1

Views: 1197

Answers (1)

Paul Panzer

Reputation: 53029

If the fraction of labelled pixels is not too large:

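flat = im.ravel()
nz = np.flatnonzero(flat)        # flat indices of all labelled pixels
order = np.argsort(flat[nz])     # sort those pixels by label
nz = nz[order]
# start offsets of labels 2..num_instances in the sorted order
blocks = np.searchsorted(flat[nz], np.arange(2, num_instances+1))
# or (which is faster will depend on numbers)
blocks = 1 + np.where(np.diff(flat[nz]))[0]
# convert flat indices back to 2D coordinates, then cut at the label boundaries
coords = np.array(np.unravel_index(nz, im.shape))
groups = np.split(coords, blocks, axis=-1)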

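groups will be a list of 2 x n_i coordinate arrays, one per component in label order, where n_i is the number of pixels in component i. Taking the min and max along each row of a group gives that component's min x, max x, min y and max y.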
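To get from groups to the bounding boxes asked for in the question, something along these lines should work. This is only a sketch on a tiny made-up test image: the function name bounding_boxes and the 6x8 array are assumptions for illustration, and the +1 on the max values follows the convention used in the question.

```python
import numpy as np
from scipy import ndimage


def bounding_boxes(labels, num_instances):
    """Return one (min_x, min_y, max_x + 1, max_y + 1) tuple per label.

    Hypothetical helper: sorts the labelled pixels once instead of
    scanning the whole image once per component.
    """
    flat = labels.ravel()
    nz = np.flatnonzero(flat)                 # flat indices of labelled pixels
    nz = nz[np.argsort(flat[nz])]             # reorder so equal labels are adjacent
    # start offsets of labels 2..num_instances in the sorted order
    blocks = np.searchsorted(flat[nz], np.arange(2, num_instances + 1))
    coords = np.array(np.unravel_index(nz, labels.shape))
    groups = np.split(coords, blocks, axis=-1)
    # each group is a 2 x n_i array: row 0 is axis 0, row 1 is axis 1
    return [
        (int(g[0].min()), int(g[1].min()), int(g[0].max()) + 1, int(g[1].max()) + 1)
        for g in groups
    ]


# Small made-up image with three separate components
im = np.zeros((6, 8), dtype=bool)
im[0:2, 0:3] = True
im[3:6, 5:7] = True
im[4, 1] = True

labels, num_instances = ndimage.label(im)
boxes = bounding_boxes(labels, num_instances)
for instance_id, box in enumerate(boxes, start=1):
    print(instance_id, box)
```

scipy.ndimage.find_objects(labels) returns the same boxes as tuples of slices, so it can serve as a cross-check or as a simpler alternative when only the bounding boxes are needed.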

Upvotes: 1
