Hamdan Azhar

Reputation: 7

Problem in extracting human silhouettes from a segmented image (felzenszwalb)

I am working on extracting binary silhouettes of people from images in the publicly available UT-Interaction dataset. I applied Felzenszwalb's segmentation to divide each image into segments based on intensity and spatial location. The next step is to extract binary silhouettes of the people from those segments, but I am unable to do this properly. Here is the original image:

original image

Here is the segmented image after applying Felzenszwalb's segmentation with scale = 200 and sigma = 0.8, where each color represents a segment:

segmented image

Here is the binary image after thresholding, where I keep only the segments whose mean intensity is above a threshold and whose height, width and area are below their thresholds:

binary image after applying threshold

Finally, here is the binary image after applying morphological erosion with iteration 1:

Final result

As you can see, segments that do not belong to either of the two people have become part of the binary silhouettes, and morphological erosion does not remove them properly. I tried erosion with different numbers of iterations and different parameter values for the segmentation algorithm, which, depending on those values, either creates large segments that ignore small changes among pixels or creates small segments that respond to even the smallest changes. The segmentation and silhouette extraction work well for some images, but for many others I have to change the segmentation parameters, the threshold values used in silhouette extraction and the number of erosion iterations. I want to automate this process, but I cannot find a method that works properly for all images in the dataset. What can I do?

Here is the code:

import numpy as np
import cv2
import skimage.segmentation as seg
from skimage.color import rgb2gray
from skimage.measure import regionprops


def morphological_erosion(binary_silhouette):

    # Create a structuring element (consider different shapes)
    kernel = np.ones((3, 3), np.uint8)  # Basic 3x3 square
    
    binary_silhouette = cv2.erode(binary_silhouette, kernel, iterations= 1)

    return binary_silhouette

def felzenszwalb_segmentation(rgb_image):
 
    # Perform segmentation using Felzenszwalb's method
    segments = seg.felzenszwalb(rgb_image, scale=200, sigma=0.8, min_size = 50)

    # Generate random colors for each segment
    num_segments = len(np.unique(segments))
    colors = np.random.randint(0, 256, size=(num_segments, 3), dtype=np.uint8)

    # Create a blank RGB image of the same size as the input image
    segmented_image = np.zeros_like(rgb_image)
    
    # Assign random colors to each segment
    for segment_id in np.unique(segments):
        mask = (segments == segment_id)
        segmented_image[mask] = colors[segment_id]

    # display the segmented image
    cv2.imshow("segmented Image", segmented_image)
    cv2.waitKey(0)
    cv2.destroyAllWindows()

    return segments

def create_binary_silhoutte(image_segments, image):
    
    # Convert the image to grayscale (cv2.imread returns BGR, rgb2gray expects RGB)
    gray_image = rgb2gray(image[..., ::-1])

    # Create an empty binary image
    binary_image = np.zeros_like(gray_image)
   
    # Calculate the mean intensity of each segment
    # regionprops ignores label 0 and Felzenszwalb labels start at 0, so shift them by 1
    for region in regionprops(image_segments + 1, intensity_image=gray_image):
        # Get the mean intensity of the segment
        mean_intensity = region.mean_intensity
        area = region.area
         # Get bounding box coordinates
        min_row, min_col, max_row, max_col = region.bbox
        
        # Calculate height and width
        height = max_row - min_row
        width = max_col - min_col

        # If the mean intensity is greater than the threshold, mark the region in the binary image
        if mean_intensity > 0.5 and area < 50000 and height < 500 and width < 500:  
            for coordinates in region.coords:
                binary_image[coordinates[0], coordinates[1]] = 255
    

    return binary_image

if __name__ == "__main__":
    # Example usage
    image_path = "path/to/image.png"  # placeholder: replace with the path to your image
    image = cv2.imread(image_path)

    # Apply Felzenszwalb segmentation to get the silhouette
    segments = felzenszwalb_segmentation(image)
    
    binary_silhouette = create_binary_silhoutte(segments, image)

    cv2.imshow("binary Silhouette", binary_silhouette)
    cv2.waitKey(0)
    cv2.destroyAllWindows()

    binary_silhouette = morphological_erosion(binary_silhouette)

    cv2.imshow("binary Silhouette", binary_silhouette)
    cv2.waitKey(0)
    cv2.destroyAllWindows()

I also tried applying noise removal, such as a bilateral filter, before the segmentation, but it did not help. Any help would be greatly appreciated.

Upvotes: -1

Views: 125

Answers (1)

Vardan Grigoryants

Reputation: 1419

I used Felzenszwalb's segmentation with scale=650, sigma=3 and min_size=1000, which gives the following result:

Segmented Image

Then you can iterate over the segment labels and use morphological opening and clear_border to extract the silhouettes.

Here is the full code:

import cv2
import matplotlib.pyplot as plt
import numpy as np
from skimage import segmentation, morphology

image_path = "path/to/image.png"  # placeholder: replace with the path to your image
image = cv2.imread(image_path)

# Apply Felzenszwalb segmentation to get the silhouette
segments_fz = segmentation.felzenszwalb(image, scale=650, sigma=3, min_size=1000)

# Create an empty binary image
binary_silhouettes = np.zeros_like(segments_fz)

for l in np.unique(segments_fz):
    # Disconnect objects
    silhouette = morphology.opening(segments_fz==l, morphology.diamond(3))

    # Clear objects connected to the image border
    silhouette = segmentation.clear_border(silhouette)
    
    # Remove small objects, i.e. objects with area<1000 will be removed
    silhouette = morphology.area_opening(silhouette, 1000)

    # Aggregate silhouette masks
    binary_silhouettes += silhouette

plt.imshow(binary_silhouettes)

The final result would be this:

Binary Silhouettes

or if we apply the mask to original image, like this:

image_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
mask_3d = np.repeat(binary_silhouettes[:,:,np.newaxis], 3, axis=-1)
plt.imshow(mask_3d * image_rgb)

we will have:

Masked Silhouettes
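
If you need to run this over many images, the per-label loop above could be wrapped in a function. The following is only a minimal sketch: the name extract_silhouettes and its parameters min_area and radius are my own placeholders, not part of scikit-image, and the size filter here uses connected-component labelling with np.bincount instead of area_opening. The defaults mirror the values used above and would most likely still need tuning for other images.

```python
import numpy as np
from skimage import measure, morphology, segmentation


def extract_silhouettes(labels, min_area=1000, radius=3):
    """Hypothetical helper: build a boolean silhouette mask from a label image."""
    mask = np.zeros(labels.shape, dtype=bool)
    footprint = morphology.diamond(radius)
    for label_id in np.unique(labels):
        # Opening breaks thin bridges between blobs of the same segment
        region = (labels == label_id).astype(np.uint8)
        region = morphology.opening(region, footprint) > 0
        # Drop anything that touches the image border (usually background)
        region = segmentation.clear_border(region)
        # Keep only connected components with at least min_area pixels
        components = measure.label(region)
        sizes = np.bincount(components.ravel())
        keep = sizes >= min_area
        keep[0] = False  # component 0 is the background of `region`
        mask |= keep[components]
    return mask
```

It would be called as mask = extract_silhouettes(segments_fz). The result is a boolean array, so it can be used directly for indexing or multiplied with the image as shown above.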

For a more general approach, you can try a SAM model, something like this (preferably on a GPU):

from transformers.utils import logging
from transformers import pipeline
import requests
from PIL import Image
import matplotlib.pyplot as plt
import torch
logging.set_verbosity_error()

device = "cuda:0" if torch.cuda.is_available() else "cpu"

sam_pipe = pipeline("mask-generation",
    "Zigeng/SlimSAM-uniform-77", device=device)

image_url = "https://i.sstatic.net/nSG3oqmP.png"
img_data = requests.get(image_url).content
with open('image.png', 'wb') as handler:
    handler.write(img_data)
img = Image.open("image.png")

output = sam_pipe(img, points_per_batch=32)

fig, axes = plt.subplots(1, 3, figsize=(20, 20))
axes[0].imshow(~output['masks'][output['scores'].argmax()])
axes[1].imshow(output['masks'][11])
axes[2].imshow(output['masks'][16])
axes[0].axis("off")
axes[1].axis("off")
axes[2].axis("off")
plt.show()


For faster execution, you can identify points where silhouettes are likely to appear and pass only those points to the model to be segmented.
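
One possible way to get such points, sketched under the assumption that a rough binary mask is already available (for example from the Felzenszwalb-based code earlier in this answer), is to take the centroid of each sufficiently large connected component. The helper name candidate_points and its min_area parameter are hypothetical. Note that a centroid can fall outside a strongly non-convex component, and how the points are handed to SAM depends on the interface you use, so that part is not shown.

```python
import numpy as np
from skimage import measure


def candidate_points(binary_mask, min_area=1000):
    """Hypothetical helper: one (x, y) point per large connected component."""
    components = measure.label(np.asarray(binary_mask) > 0)
    points = []
    for region in measure.regionprops(components):
        if region.area >= min_area:
            # regionprops reports centroids as (row, col); return them as (x, y)
            row, col = region.centroid
            points.append((int(round(float(col))), int(round(float(row)))))
    return points
```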

Upvotes: 0
