kulvinder

Reputation: 539

Background removal from images with OpenCV in Android

I want to remove the image background with OpenCV in Android. The code runs fine, but the output quality is not as expected. I followed the Java documentation for reference:

https://opencv-java-tutorials.readthedocs.io/en/latest/07-image-segmentation.html

Thanks

[Original image]

[My output] [Expected output]

My code snippet in Android:

private fun doBackgroundRemoval(frame: Mat): Mat? {
    // Init
    val hsvImg = Mat()
    val hsvPlanes: MutableList<Mat> = ArrayList()
    val thresholdImg = Mat()
    val threshType = Imgproc.THRESH_BINARY

    // Convert the frame to HSV and split it into its planes:
    hsvImg.create(frame.size(), CvType.CV_8UC3)
    Imgproc.cvtColor(frame, hsvImg, Imgproc.COLOR_BGR2HSV)
    Core.split(hsvImg, hsvPlanes)

    // Threshold the hue plane with the average hue value of the image:
    val threshValue: Double = getHistAverage(hsvImg, hsvPlanes[0])
    Imgproc.threshold(hsvPlanes[0], thresholdImg, threshValue, 78.0, threshType)
    // Note: a 1 x 1 blur is effectively a no-op
    Imgproc.blur(thresholdImg, thresholdImg, Size(1.0, 1.0))

    val kernel1 =
        Imgproc.getStructuringElement(Imgproc.MORPH_ELLIPSE, Size(11.0, 11.0))
    val kernel2 = Mat.ones(3, 3, CvType.CV_8U)
    // Dilate to fill gaps, erode to smooth edges:
    Imgproc.dilate(thresholdImg, thresholdImg, kernel1, Point(-1.0, -1.0), 1)
    Imgproc.erode(thresholdImg, thresholdImg, kernel2, Point(-1.0, -1.0), 7)
    Imgproc.threshold(thresholdImg, thresholdImg, threshValue, 255.0, Imgproc.THRESH_BINARY_INV)

    // Copy the frame onto a white background through the mask:
    val foreground = Mat(frame.size(), CvType.CV_8UC3, Scalar(255.0, 255.0, 255.0))
    frame.copyTo(foreground, thresholdImg)

    // Convert to a Bitmap and display it:
    val imgBitmap =
        Bitmap.createBitmap(foreground.cols(), foreground.rows(), Bitmap.Config.ARGB_8888)
    Utils.matToBitmap(foreground, imgBitmap)
    imageView.setImageBitmap(imgBitmap)

    return foreground
}

Upvotes: 3

Views: 4542

Answers (2)

Red

Reputation: 27567

The Code

import cv2
import numpy as np

def process(img):
    img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    img_canny = cv2.Canny(img_gray, 10, 20)
    kernel = np.ones((13, 13))
    img_dilate = cv2.dilate(img_canny, kernel, iterations=1)
    return cv2.erode(img_dilate, kernel, iterations=1)
    
def get_mask(img):
    contours, _ = cv2.findContours(process(img), cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)
    blank = np.zeros(img.shape[:2]).astype('uint8')
    for cnt in contours:
        if cv2.contourArea(cnt) > 500:
            peri = cv2.arcLength(cnt, True)
            approx = cv2.approxPolyDP(cnt, peri * 0.004, True)
            cv2.drawContours(blank, [approx], -1, 255, -1) 
    return blank

img = cv2.imread("crystal.jpg")
img_masked = cv2.bitwise_and(img, img, mask=get_mask(img))

cv2.imshow("Masked", img_masked)
cv2.waitKey(0)

The Output

[Output: the masked image]

The Explanation

  1. Import the necessary libraries:
import cv2
import numpy as np
  2. Define a function to process the image so it is fit for proper contour detection. In the function, first convert the image to grayscale, then detect its edges using the Canny edge detector. With the edges detected, we can dilate and erode them once to give the edges more body:
def process(img):
    img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    img_canny = cv2.Canny(img_gray, 10, 20)
    kernel = np.ones((13, 13))
    img_dilate = cv2.dilate(img_canny, kernel, iterations=1)
    return cv2.erode(img_dilate, kernel, iterations=1)
  3. Define a function to generate a mask for the image. After finding the contours of the image, define a grayscale blank image with the shape of the image, and draw every contour (of area greater than 500, to filter out noise) filled in onto the blank image. I also approximated the contours to smooth things out a bit:
def get_mask(img):
    contours, _ = cv2.findContours(process(img), cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)
    blank = np.zeros(img.shape[:2]).astype('uint8')
    for cnt in contours:
        if cv2.contourArea(cnt) > 500:
            peri = cv2.arcLength(cnt, True)
            approx = cv2.approxPolyDP(cnt, peri * 0.004, True)
            cv2.drawContours(blank, [approx], -1, 255, -1) 
    return blank
  4. Finally, read in the image and mask it using the cv2.bitwise_and method along with the get_mask function we defined (which in turn uses the process function). Show the masked image at the end:
img = cv2.imread("crystal.jpg")
img_masked = cv2.bitwise_and(img, img, mask=get_mask(img))

cv2.imshow("Masked", img_masked)
cv2.waitKey(0)

Transparent Background

Instead of the cv2.bitwise_and method, you can use the cv2.merge method:

img = cv2.imread("crystal.jpg")
img_masked = cv2.merge(cv2.split(img) + [get_mask(img)])
cv2.imwrite("masked_crystal.png", img_masked)

Resulting image (screenshot):


Explanation:

  1. Keeping in mind that we have already imported the cv2 module and the numpy module as np, and defined the process and get_mask functions, we can read in the image:
img = cv2.imread("crystal.jpg")
  2. The cv2.split method takes in an image array and returns a list of every individual channel present in the image. In our case there are only 3 channels, and to make the image transparent we need a fourth channel: the alpha channel. The cv2.merge method does the opposite of cv2.split; it takes in a list of individual channels and returns an image array with those channels. So next we get the BGR channels of the image in a list, and concatenate the mask of the image as the alpha channel:
img_masked = cv2.merge(cv2.split(img) + [get_mask(img)])
  3. Lastly, we can write the four-channel image into a file:
cv2.imwrite("masked_crystal.png", img_masked)

Here are some more examples of the cv2.merge method: Python cv2.merge() Examples

Upvotes: 4

stateMachine

Reputation: 5805

The task, as you have seen, is not trivial at all. OpenCV has a segmentation algorithm called "GrabCut" that tries to solve this particular problem. The algorithm is pretty good at classifying background and foreground pixels; however, it needs very specific information to work. It can operate in two modes:

  • 1st Mode (Mask Mode): Using a Binary Mask (same size as the original input) where 100% definite background pixels are marked, as well as 100% definite foreground pixels. You don't have to mark every pixel on the image, just a region where you are sure the algorithm will find either class of pixels.

  • 2nd Mode (Foreground ROI): Using a bounding box that encloses 100% definite foreground pixels.

Now, I use the notation "100% definite" to label those pixels you are 100% sure correspond to either the background or the foreground. The algorithm classifies pixels into four possible classes: "Definite Background", "Probable Background", "Definite Foreground" and "Probable Foreground". It will predict both Probable Background and Probable Foreground pixels, but it needs a priori information on where to find at least "Definite Foreground" pixels.
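For reference, here is a minimal sketch (my addition, not part of the pipeline below) of how these four classes map to OpenCV's constants, and of how the 1st (mask) mode could be invoked. The input file name and the definite-foreground region are hypothetical:

import cv2
import numpy as np

# The four GrabCut classes as OpenCV constants:
# cv2.GC_BGD = 0 (Definite Background), cv2.GC_FGD = 1 (Definite Foreground),
# cv2.GC_PR_BGD = 2 (Probable Background), cv2.GC_PR_FGD = 3 (Probable Foreground)

# Mask mode: start everything as "Probable Background", then mark a region
# you are 100% sure belongs to the foreground:
image = cv2.imread("input.png")  # hypothetical input
mask = np.full(image.shape[:2], cv2.GC_PR_BGD, np.uint8)
mask[100:200, 100:200] = cv2.GC_FGD  # hypothetical definite-foreground region

bgModel = np.zeros((1, 65), np.float64)
fgModel = np.zeros((1, 65), np.float64)
mask, bgModel, fgModel = cv2.grabCut(image, mask, None, bgModel, fgModel, 5,
                                     mode=cv2.GC_INIT_WITH_MASK)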

With that said, we can use GrabCut in its 2nd mode (Rectangle ROI) to try and segment the input image. We can try to get a first, rough, binary mask of the input. This will mark where we are sure the algorithm can find foreground pixels. We will feed this rough mask to the algorithm and check out the results. Now, the method is not easy and its automation is not straightforward; there is some manual information we will set that works particularly well for this input image. I don't know the Java implementation of OpenCV, so I'm giving you the solution in Python. Hopefully you will be able to port it. This is the general outline of the algorithm:

  1. Get a first rough mask of the foreground object via thresholding
  2. Detect contours on the rough mask to retrieve a bounding rectangle
  3. The bounding rectangle will serve as input ROI for the GrabCut algorithm
  4. Set the parameters needed for the GrabCut algorithm
  5. Clean the segmentation mask obtained by GrabCut
  6. Use the segmentation mask to finally segment the foreground object

This is the code:

# imports:
import cv2
import numpy as np

# image path
path = "D://opencvImages//"
fileName = "backgroundTest.png"

# Reading an image in default mode:
inputImage = cv2.imread(path + fileName)

# (Optional) Deep copy for results:
inputImageCopy = inputImage.copy()

# Convert BGR to grayscale:
grayscaleImage = cv2.cvtColor(inputImage, cv2.COLOR_BGR2GRAY)

# Adaptive Thresholding
windowSize = 31
windowConstant = 11
binaryImage = cv2.adaptiveThreshold(grayscaleImage, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY_INV, windowSize, windowConstant)

The first step is to get the rough foreground mask using Adaptive Thresholding. Here, I've used the ADAPTIVE_THRESH_MEAN_C method, where the (local) threshold value is the mean of a neighborhood area on the input image, minus the constant windowConstant. This yields the following image:

It's pretty rough, right? We can clean it up a little using some morphology. I use a closing with a rectangular kernel of size 3 x 3 and 10 iterations to join the big blobs of white pixels. I've wrapped the OpenCV functions inside custom functions that save me some typing. These helper functions are presented at the end of this post. For now, this step is as follows:

# Apply a morphological closing with:
# Rectangular SE size 3 x 3 and 10 iterations
binaryImage = morphoOperation(binaryImage, 3, 10, "Closing")

This is the rough mask after filtering:

A little bit better. Ok, we can now search for the bounding box of the biggest contour. A search for the outer contours via cv2.RETR_EXTERNAL will suffice for this example, as we can safely ignore child contours, like this:

# Find the EXTERNAL contours on the binary image:
contours, hierarchy = cv2.findContours(binaryImage, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

# This list will store the target bounding box
maskRect = []

Additionally, let's get a list ready where we will store the target bounding rectangle. Let's now search the detected contours. I've also implemented an area filter in case some noise is present, so blobs below a certain area threshold are ignored:

# Look for the outer bounding boxes (no children):
for i, c in enumerate(contours):

    # Get blob area:
    currentArea = cv2.contourArea(c)

    # Get the bounding rectangle:
    boundRect = cv2.boundingRect(c)

    # Set a minimum area
    minArea = 1000

    # Look for the target contour:
    if currentArea > minArea:

        # Found the target bounding rectangle:
        maskRect = boundRect

        # (Optional) Draw the rectangle on the input image:
        # Get the dimensions of the bounding rect:
        rectX = boundRect[0]
        rectY = boundRect[1]
        rectWidth = boundRect[2]
        rectHeight = boundRect[3]

        # (Optional) Set color and draw:
        color = (0, 0, 255)
        cv2.rectangle( inputImageCopy, (int(rectX), int(rectY)),
                    (int(rectX + rectWidth), int(rectY + rectHeight)), color, 2 )
        
        # (Optional) Show image:
        cv2.imshow("Bounding Rectangle", inputImageCopy)
        cv2.waitKey(0)

Optionally you can draw the bounding box found by the algorithm. This is the resulting image:

It is looking good. Note that some obvious background pixels are also enclosed by the ROI. GrabCut will try to re-classify these pixels into their proper class, i.e., "Definite Background". Alright, let's prepare the data for GrabCut:

# Create mask for Grab n Cut,
# The mask is a uint8 type, same dimensions as
# original input:
mask = np.zeros(inputImage.shape[:2], np.uint8)

# Grab n Cut needs two empty matrices of
# Float type (64 bits) and size 1 (rows) x 65 (columns):
bgModel = np.zeros((1, 65), np.float64)
fgModel = np.zeros((1, 65), np.float64)

We need to prepare three matrices (numpy arrays, or whatever data type is used to represent images in Java). The first is where the segmentation mask obtained by GrabCut will be stored. This mask will have values from 0 to 3 to denote the class of each pixel on the original input. The bgModel and fgModel matrices are used internally by the algorithm to store the statistical model of the foreground and background. Be aware that both of these matrices are float matrices. Lastly, GrabCut is an iterative algorithm; it will run for n iterations. Ok, let's run GrabCut:

# Run Grab n Cut on INIT_WITH_RECT mode:
grabCutIterations = 5
mask, bgModel, fgModel = cv2.grabCut(inputImage, mask, maskRect, bgModel, fgModel, grabCutIterations, mode=cv2.GC_INIT_WITH_RECT)

Ok, the classification is done. You can convert mask to a visible (image) type to check out the label of each pixel. This is optional, but should you wish to do so, you'd get 4 binary matrices, one per class. For example, for the "Definite Background" class, GrabCut found these pixels belonging to that class (in white):
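A minimal sketch of that optional visualization (my addition; it assumes the mask returned by the cv2.grabCut call above):

# Visualize each GrabCut class as a separate binary image:
classLabels = {"Definite Background": cv2.GC_BGD,
               "Probable Background": cv2.GC_PR_BGD,
               "Definite Foreground": cv2.GC_FGD,
               "Probable Foreground": cv2.GC_PR_FGD}

for className, classValue in classLabels.items():
    classMask = np.where(mask == classValue, 255, 0).astype("uint8")
    cv2.imshow(className, classMask)
    cv2.waitKey(0)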

The pixels belonging to the "Probable Background" class are these:

That's pretty good, huh? Here are the pixels belonging to the "Probable Foreground" class:

Very nice. Let's create the final segmentation mask, because mask is not an image; it is just an array containing a label for each pixel. We will use the Definite Background and Probable Background pixels to set the final mask; we can then "normalize" the data range and convert it to uint8 to obtain an actual image:

# Set all definite background (0) and probable background pixels (2)
# to 0 while definite foreground and probable foreground pixels are
# set to 1
outputMask = np.where((mask == cv2.GC_BGD) | (mask == cv2.GC_PR_BGD), 0, 1)

# Scale the mask from the range [0, 1] to [0, 255]
outputMask = (outputMask * 255).astype("uint8")

This is the actual segmentation mask:

Alright, we can clean this image up a little bit, because there are some small holes produced by misclassifying foreground pixels as background pixels. Let's apply another morphological closing, this time using 5 iterations:

# (Optional) Apply a morphological closing with:
# Rectangular SE size 3 x 3 and 5 iterations:
outputMask = morphoOperation(outputMask, 3, 5, "Closing")

Finally, use this outputMask in an AND with the original image to produce the final segmented result:

# Apply a bitwise AND to the image using our mask generated by
# GrabCut to generate the final output image:
segmentedImage = cv2.bitwise_and(inputImage, inputImage, mask=outputMask)

cv2.imshow("Segmented Image", segmentedImage)
cv2.waitKey(0)

This is the final result:

If you need transparency on this image, it is very straightforward to use outputMask as the alpha channel, as sketched below.
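A minimal sketch of that (my addition), reusing cv2.split and cv2.merge; the output file name is hypothetical:

# Split the BGR channels and append the GrabCut mask as the alpha channel:
b, g, r = cv2.split(inputImage)
transparentImage = cv2.merge([b, g, r, outputMask])
# PNG preserves the alpha channel (hypothetical file name):
cv2.imwrite(path + "segmentedTransparent.png", transparentImage)

This is the helper function I used earlier: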

# Applies a morpho operation:
def morphoOperation(binaryImage, kernelSize, opIterations, opString):
    # Get the structuring element:
    morphKernel = cv2.getStructuringElement(cv2.MORPH_RECT, (kernelSize, kernelSize))
    # Perform Operation:
    if opString == "Closing":
        op = cv2.MORPH_CLOSE
    else:
        print("Morpho Operation not defined!")
        return None

    outImage = cv2.morphologyEx(binaryImage, op, morphKernel, None, None, opIterations, cv2.BORDER_REFLECT101)

    return outImage

Upvotes: 6
