Reputation: 7
I am currently working on extracting binary silhouettes of the people in images from the publicly available UT-Interaction dataset. I have applied Felzenszwalb's segmentation to divide each image into segments according to intensity and spatial location. From these segments I then need to extract the binary silhouettes of the people, but I am unable to do this properly. Here is the original image:
Here is the segmented image after applying Felzenszwalb's segmentation with scale = 200 and sigma = 0.8, where each color represents a segment:
Here is the binary image after thresholding, where I keep only segments whose mean intensity is above a threshold and whose height, width and area are below their respective thresholds:
Finally, here is the binary image after applying morphological erosion with one iteration:
Now as you can see, segments that do not belong to either of the two people have become part of the binary silhouettes. Even after morphological erosion, these segments are not removed properly. I tried erosion with different numbers of iterations and different parameter values for the segmentation algorithm, which, depending on the parameters, either creates large segments that ignore small changes among pixels or creates small segments that follow even the smallest changes. The segmentation and silhouette extraction work well for some images, but for many images I have to change the segmentation parameters, the thresholds used in silhouette extraction and the number of erosion iterations. I want to automate this process, but I cannot find a method that works properly for all images in the dataset. What can I do?
Here is the code:
import numpy as np
import cv2
import skimage.segmentation as seg
from skimage.color import rgb2gray
from skimage.measure import regionprops


def morphological_erosion(binary_silhouette):
    # Create a structuring element (consider different shapes)
    kernel = np.ones((3, 3), np.uint8)  # Basic 3x3 square
    binary_silhouette = cv2.erode(binary_silhouette, kernel, iterations=1)
    return binary_silhouette


def felzenszwalb_segmentation(rgb_image):
    # Perform segmentation using Felzenszwalb's method
    segments = seg.felzenszwalb(rgb_image, scale=200, sigma=0.8, min_size=50)
    # Generate random colors for each segment
    num_segments = len(np.unique(segments))
    colors = np.random.randint(0, 256, size=(num_segments, 3), dtype=np.uint8)
    # Create a blank RGB image of the same size as the input image
    segmented_image = np.zeros_like(rgb_image)
    # Assign random colors to each segment
    for segment_id in np.unique(segments):
        mask = (segments == segment_id)
        segmented_image[mask] = colors[segment_id]
    # Display the segmented image
    cv2.imshow("segmented Image", segmented_image)
    cv2.waitKey(0)
    cv2.destroyAllWindows()
    return segments


def create_binary_silhoutte(image_segments, image):
    # Convert the image to grayscale
    gray_image = rgb2gray(image)
    # Create an empty binary image
    binary_image = np.zeros_like(gray_image)
    # Calculate the mean intensity of each segment
    for region in regionprops(image_segments, intensity_image=gray_image):
        # Get the mean intensity of the segment
        mean_intensity = region.mean_intensity
        area = region.area
        # Get bounding box coordinates
        min_row, min_col, max_row, max_col = region.bbox
        # Calculate height and width
        height = max_row - min_row
        width = max_col - min_col
        # If the mean intensity is above the threshold and the segment is
        # small enough, mark the region in the binary image
        if mean_intensity > 0.5 and area < 50000 and height < 500 and width < 500:
            for coordinates in region.coords:
                binary_image[coordinates[0], coordinates[1]] = 255
    return binary_image


if __name__ == "__main__":
    # Example usage
    image_path = # path to image
    image = cv2.imread(image_path)
    # Apply Felzenszwalb segmentation to get the silhouette
    segments = felzenszwalb_segmentation(image)
    binary_silhouette = create_binary_silhoutte(segments, image)
    cv2.imshow("binary Silhouette", binary_silhouette)
    cv2.waitKey(0)
    cv2.destroyAllWindows()
    binary_silhouette = morphological_erosion(binary_silhouette)
    cv2.imshow("binary Silhouette", binary_silhouette)
    cv2.waitKey(0)
    cv2.destroyAllWindows()
I also tried noise removal, such as a bilateral filter, before applying the segmentation, but it did not help. Any help would be greatly appreciated.
Upvotes: -1
Views: 125
Reputation: 1419
I would use Felzenszwalb's segmentation with scale=650, sigma=3 and min_size=1000, which gives the following result:
Then you can iterate over the segment labels and, for each one, apply a morphological opening, clear the objects touching the image border and remove small objects to extract the silhouettes.
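To see what the per-label cleanup does in isolation, here is a minimal sketch on a synthetic label image rather than a real frame. The helper name `clean_segment`, the toy sizes and the `radius` and `min_area` values are my own assumptions for illustration. For the size filter I count pixels per connected component with `measure.label` and `np.bincount`, which is a substitution for the `area_opening` call used in the full code.

```python
import numpy as np
from skimage import measure, morphology, segmentation


def clean_segment(mask, radius=1, min_area=20):
    """Open one segment mask, drop border-touching parts, drop small blobs."""
    # Opening removes thin bridges and specks smaller than the footprint
    opened = morphology.opening(
        mask.astype(np.uint8), morphology.diamond(radius)
    ).astype(bool)
    # Remove connected components that touch the image border
    interior = segmentation.clear_border(opened)
    # Keep only connected components with at least min_area pixels
    labelled = measure.label(interior)
    areas = np.bincount(labelled.ravel())
    keep = areas >= min_area
    keep[0] = False  # label 0 is the background of this mask
    return keep[labelled]


# Toy label image (hypothetical stand-in for the Felzenszwalb output)
labels = np.zeros((40, 40), dtype=int)
labels[0:8, 0:8] = 1      # segment touching the border
labels[10:20, 10:20] = 2  # interior segment, large enough to keep
labels[30:34, 30:34] = 3  # interior segment, too small after opening

silhouettes = np.zeros(labels.shape, dtype=bool)
for label_id in np.unique(labels):
    silhouettes |= clean_segment(labels == label_id)
```

Only the interior 10x10 segment survives here, with its corners trimmed by the diamond-shaped opening. On real frames the radius and the minimum area would have to be chosen relative to the image resolution.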
Here is the full code:
import cv2
import matplotlib.pyplot as plt
import numpy as np
from skimage import segmentation, morphology

image_path = # path to image
image = cv2.imread(image_path)
# Apply Felzenszwalb segmentation to get the silhouette
segments_fz = segmentation.felzenszwalb(image, scale=650, sigma=3, min_size=1000)
# Create an empty binary image
binary_silhouettes = np.zeros_like(segments_fz)
for label_id in np.unique(segments_fz):
    # Disconnect objects
    silhouette = morphology.opening(segments_fz == label_id, morphology.diamond(3))
    # Clear objects connected to the image border
    silhouette = segmentation.clear_border(silhouette)
    # Remove small objects, i.e. objects with area<1000 will be removed
    silhouette = morphology.area_opening(silhouette, 1000)
    # Aggregate silhouette masks
    binary_silhouettes += silhouette
plt.imshow(binary_silhouettes)
The final result would be this:
Or, if we apply the mask to the original image like this:
image_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
mask_3d = np.repeat(binary_silhouettes[:,:,np.newaxis], 3, axis=-1)
plt.imshow(mask_3d * image_rgb)
we will have:
For a more general approach, you can try a SAM model, something like this (preferably run it on a GPU):
from transformers.utils import logging
from transformers import pipeline
import requests
from PIL import Image
import matplotlib.pyplot as plt
import torch

logging.set_verbosity_error()
device = "cuda:0" if torch.cuda.is_available() else "cpu"
sam_pipe = pipeline("mask-generation",
                    "Zigeng/SlimSAM-uniform-77", device=device)

image_url = "https://i.sstatic.net/nSG3oqmP.png"
img_data = requests.get(image_url).content
with open('image.png', 'wb') as handler:
    handler.write(img_data)
img = Image.open("image.png")

output = sam_pipe(img, points_per_batch=32)

fig, axes = plt.subplots(1, 3, figsize=(20, 20))
axes[0].imshow(~output['masks'][output['scores'].argmax()])
axes[1].imshow(output['masks'][11])
axes[2].imshow(output['masks'][16])
axes[0].axis("off")
axes[1].axis("off")
axes[2].axis("off")
plt.show()
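The mask indices 11 and 16 above are specific to this image. As a rough sketch of how masks could be picked automatically instead, the helper below keeps masks whose area fraction and bounding-box aspect ratio look like a standing person. The function name `person_like` and all three thresholds are assumptions of mine and would need tuning on the dataset. The toy arrays stand in for the boolean entries of `output['masks']`.

```python
import numpy as np


def person_like(mask, min_frac=0.01, max_frac=0.3, min_aspect=1.5):
    """Heuristic check: is this boolean mask shaped like an upright person?"""
    mask = np.asarray(mask, dtype=bool)
    rows, cols = np.nonzero(mask)
    if rows.size == 0:
        return False
    # Bounding-box height and width in pixels
    height = rows.max() - rows.min() + 1
    width = cols.max() - cols.min() + 1
    # Fraction of the image covered by the mask
    frac = rows.size / mask.size
    return bool(min_frac <= frac <= max_frac and height / width >= min_aspect)


# Toy masks (hypothetical stand-ins for output['masks'])
tall = np.zeros((100, 100), dtype=bool)
tall[20:80, 40:55] = True   # tall, medium-sized blob
wide = np.zeros((100, 100), dtype=bool)
wide[40:50, 10:90] = True   # wide, flat blob
background = ~tall          # covers most of the image
tiny = np.zeros((100, 100), dtype=bool)
tiny[5:8, 5:6] = True       # tall but far too small

masks = [tall, wide, background, tiny]
selected = [i for i, m in enumerate(masks) if person_like(m)]
```

This kind of filter is only a heuristic: people who are bending, lying down or heavily occluded would fail the aspect-ratio test, so it may need relaxing for interaction frames.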
For faster execution, you can identify points where silhouettes are likely to appear and pass only those points to the model as prompts.
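One possible source of such points, purely as an assumption on my part, is the coarse binary mask from the Felzenszwalb step: the centroid of each sufficiently large blob can serve as a candidate prompt. The sketch below only computes the points, as `(x, y)` pairs. How they are handed to SAM depends on the API you use and is not shown. The name `candidate_points` and the `min_area` value are hypothetical.

```python
import numpy as np
from skimage import measure


def candidate_points(binary_mask, min_area=50):
    """Return (x, y) centroids of blobs with at least min_area pixels."""
    labelled = measure.label(np.asarray(binary_mask) > 0)
    points = []
    for region in measure.regionprops(labelled):
        if region.area >= min_area:
            row, col = region.centroid  # regionprops reports (row, col)
            points.append((int(round(col)), int(round(row))))
    return points


# Toy coarse mask (hypothetical stand-in for binary_silhouettes)
coarse = np.zeros((60, 80), dtype=np.uint8)
coarse[10:31, 20:31] = 1  # first blob
coarse[40:51, 60:71] = 1  # second blob
coarse[2:4, 2:4] = 1      # speck, below min_area

points = candidate_points(coarse)
```

Note that a centroid can fall outside a strongly concave blob, so for real silhouettes it may be safer to verify that the point lies on the mask before using it as a prompt.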
Upvotes: 0