Reputation: 3854
I have a binary image that contains some isolated regions, such as noise. I know that the expected region is much larger than these isolated regions, so I tried to use connected components to remove them by keeping only the largest connected region. I have to use the scipy package, and I found that it has some functions for this. However, my result is still far from what I expect. How can I use these functions to obtain a binary image without the isolated regions? Thanks
import numpy as np
from scipy import ndimage
label_im, nb_labels = ndimage.label(binary_img)
# Find the largest connected component
sizes = ndimage.sum(binary_img, label_im, range(nb_labels + 1))
mask_size = sizes < 1000
remove_pixel = mask_size[label_im]
label_im[remove_pixel] = 0
labels = np.unique(label_im)
binary_img= np.searchsorted(labels, label_im)
#Select the biggest connected component
binary_img[binary_img < binary_img.max()]=0
binary_img[binary_img >= binary_img.max()]=1
Upvotes: 3
Views: 3533
Reputation: 60444
You have a good start, using ndimage.sum to find the sizes of each labeled region. From there, you can use sizes (or something derived from it) as a lookup table:
from scipy import ndimage
label_im, nb_labels = ndimage.label(binary_img)
sizes = ndimage.sum(binary_img, label_im, range(nb_labels + 1))
mask = sizes > 1000
binary_img = mask[label_im]
This creates a lookup table mask that is true for the indices that correspond to the labels of the larger regions, and false elsewhere. Indexing into the lookup table using the labeled image yields the desired binary image.
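To see the lookup table at work, here is a minimal sketch on a small synthetic image. The test image and the min_size threshold of 4 are assumptions made for illustration; the question uses 1000 on real data.

```python
import numpy as np
from scipy import ndimage

# Hypothetical test image: one 4x4 square plus two isolated noise pixels.
binary_img = np.zeros((8, 8), dtype=bool)
binary_img[1:5, 1:5] = True   # large region, 16 pixels
binary_img[6, 6] = True       # noise, 1 pixel
binary_img[0, 7] = True       # noise, 1 pixel

label_im, nb_labels = ndimage.label(binary_img)

# sizes[k] is the pixel count of label k; sizes[0] is the background and is 0
# because binary_img is False there.
sizes = ndimage.sum(binary_img, label_im, range(nb_labels + 1))

min_size = 4                  # assumed threshold for this toy image
mask = sizes > min_size       # one boolean per label
cleaned = mask[label_im]      # look up each pixel's label in the table

print(nb_labels, int(cleaned.sum()))
```

Because sizes[0] is 0, the background entry of mask is false for any positive threshold, so background pixels stay off without special handling.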
Note that sizes[label_im] is an image where each region is painted with its size. That is, every pixel in region #1 gets the value of the size of region #1. You can threshold this image to remove small regions:
size_img = sizes[label_im]
binary_img = size_img > 1000
(These two lines are equivalent to the last two lines of the previous code snippet.)
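If the goal is strictly the single largest region, as described in the question, one possible variation is to pick the label with the largest size instead of thresholding. This is only a sketch: the function name keep_largest_component is hypothetical, and ties between equally large regions are resolved by whichever label argmax returns first.

```python
import numpy as np
from scipy import ndimage

def keep_largest_component(binary_img):
    """Return a boolean image containing only the largest connected region."""
    binary_img = np.asarray(binary_img, dtype=bool)
    label_im, nb_labels = ndimage.label(binary_img)
    if nb_labels == 0:
        # No foreground at all: nothing to keep.
        return np.zeros_like(binary_img)
    # sizes[0] is the background and equals 0, so argmax never selects it
    # when at least one foreground region exists.
    sizes = ndimage.sum(binary_img, label_im, range(nb_labels + 1))
    largest_label = int(np.argmax(sizes))
    return label_im == largest_label

# Hypothetical example: a 16-pixel square, a 4-pixel blob and a single pixel.
img = np.zeros((10, 10), dtype=bool)
img[1:5, 1:5] = True
img[7:9, 7:9] = True
img[0, 9] = True
largest = keep_largest_component(img)
```

This avoids having to guess a size threshold, at the cost of discarding every region except one.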
Upvotes: 8