Efficiently find centroid of labelled image regions

I have a segmented image as a 2 dimensional matrix of unique labels 1 ... k. For example:

img = 
    [1 1 2 2 2 2 2 3 3]
    [1 1 1 2 2 2 2 3 3]
    [1 1 2 2 2 2 3 3 3]
    [1 4 4 4 2 2 2 2 3]
    [4 4 4 5 5 5 2 3 3]
    [4 4 4 5 5 6 6 6 6]
    [4 4 5 5 5 5 6 6 6]

I am trying to determine the region centroids. That is, per label, what is the X,Y coordinate of the center of mass? For example, the centroid of label 1 is (1.25, 0.625). Just average up the row numbers ((0 + 0 + 1 + 1 + 1 + 2 + 2 + 3) / 8 = 1.25) and the column numbers ((0 + 0 + 0 + 0 + 1 + 1 + 1 + 2) / 8 = 0.625)

The only way I know how to do this is to use a for loop from 1 to k (or in the case of my example, 1 through 6), find the indices of the points for each label, and average their coordinates by indexing a meshgrid of the image.

However, I am looking to do this in a way optimized for GPU computations. Hence, the use of a for loop is less than ideal (takes about 1 sec per image on a nice GPU for a few hundred labels). I am using PyTorch, but really any numpy solution should suffice.

Is there a GPU-efficient solution for this task?

Upvotes: 1

Answers (3)

Divakar

Reputation: 221564

One idea would be to use bincount to accumulate the row and column indices for each region using the numbers in the input array as the bins and thus have a vectorized solution, like so -

m,n = a.shape
r,c = np.mgrid[:m,:n]
count = np.bincount(a.ravel())
centroid_row = np.bincount(a.ravel(),r.ravel())/count
centroid_col = np.bincount(a.ravel(),c.ravel())/count

Sample run -

In [77]: a
Out[77]: 
array([[1, 1, 2, 2, 2, 2, 2, 3, 3],
       [1, 1, 1, 2, 2, 2, 2, 3, 3],
       [1, 1, 2, 2, 2, 2, 3, 3, 3],
       [1, 4, 4, 4, 2, 2, 2, 2, 3],
       [4, 4, 4, 5, 5, 5, 2, 3, 3],
       [4, 4, 4, 5, 5, 6, 6, 6, 6],
       [4, 4, 5, 5, 5, 5, 6, 6, 6]])

In [78]: np.c_[centroid_row, centroid_col]
Out[78]: 
array([[  nan,   nan], 
       [ 1.25,  0.62], # centroid for region-1
       [ 1.56,  4.44], # centroid for region-2
       [ 1.9 ,  7.4 ], # centroid for region-3 and so on.
       [ 4.36,  1.18],
       [ 5.11,  3.67],
       [ 5.43,  6.71]])

Upvotes: 2

sascha

Reputation: 33532

Consider either using scikit-image or reusing their code (based on numpy/scipy).

Here is a demo:

import numpy as np
from skimage import measure
from time import perf_counter as pc

img = np.array([[1, 1, 2, 2, 2, 2, 2, 3, 3],
                [1, 1, 1, 2, 2, 2, 2, 3, 3],
                [1, 1, 2, 2, 2, 2, 3, 3, 3],
                [1, 4, 4, 4, 2, 2, 2, 2, 3],
                [4, 4, 4, 5, 5, 5, 2, 3, 3],
                [4, 4, 4, 5, 5, 6, 6, 6, 6],
                [4, 4, 5, 5, 5, 5, 6, 6, 6]])

# assuming already labels of 1, 2, ... n
times = [pc()]
props = measure.regionprops(img)
times.append(pc())
for i in range(np.unique(img).shape[0]):
    print(props[i].centroid)
    times.append(pc())

print(np.diff(times))

Output:

(1.25, 0.625)
(1.5555555555555556, 4.4444444444444446)
(1.8999999999999999, 7.4000000000000004)
(4.3636363636363633, 1.1818181818181819)
(5.1111111111111107, 3.6666666666666665)
(5.4285714285714288, 6.7142857142857144)
[  9.05569615e-05   8.51235438e-04   2.48126075e-04   2.59294767e-04
   2.42692657e-04   2.00734598e-04   2.34542530e-04]

Upvotes: 2

Cris Luengo

Reputation: 60504

This computation requires accumulation, I don't know how efficient that is on a GPU. This is the sequential algorithm in psuedo-code:

int n[k] = 0
int sx[k] = 0
int sy[k] = 0
loop over y:
   loop over x:
      i = img[x,y]
      ++n[i]
      sx[i] += x
      sy[i] += y
for i = 1 to k
    sx[i] /= n[i]
    sy[i] /= n[i]

And then of course, (sx[i],sy[i]) is the centroid of object i.

It's really fast on a CPU, it's not worth the effort to send the data to the GPU for this, unless it's already there.

Upvotes: 2

Efficiently find centroid of labelled image regions

Answers (3)

Related Questions