Reputation: 539
I want to remove the background from an image with OpenCV on Android. The code works, but the output quality is not what I expected. I followed the Java tutorial below as a reference for the code:
https://opencv-java-tutorials.readthedocs.io/en/latest/07-image-segmentation.html
Thanks
My code snippet in Android:
private fun doBackgroundRemoval(frame: Mat): Mat? {
    // init
    val hsvImg = Mat()
    val hsvPlanes: List<Mat> = ArrayList()
    val thresholdImg = Mat()
    var thresh_type = Imgproc.THRESH_BINARY_INV
    thresh_type = Imgproc.THRESH_BINARY
    // threshold the image with the average hue value
    hsvImg.create(frame.size(), CvType.CV_8U)
    Imgproc.cvtColor(frame, hsvImg, Imgproc.COLOR_BGR2HSV)
    Core.split(hsvImg, hsvPlanes)
    // get the average hue value of the image
    val threshValue: Double = getHistAverage(hsvImg, hsvPlanes[0])
    threshold(hsvPlanes[0], thresholdImg, threshValue, 78.0, thresh_type)
    Imgproc.blur(thresholdImg, thresholdImg, Size(1.toDouble(), 1.toDouble()))
    val kernel1 =
        Imgproc.getStructuringElement(Imgproc.MORPH_ELLIPSE, Size(11.toDouble(), 11.toDouble()))
    val kernel2 = Mat.ones(3, 3, CvType.CV_8U)
    // dilate to fill gaps, erode to smooth edges
    Imgproc.dilate(thresholdImg, thresholdImg, kernel1, Point(-1.toDouble(), -1.toDouble()), 1)
    Imgproc.erode(thresholdImg, thresholdImg, kernel2, Point(-1.toDouble(), -1.toDouble()), 7)
    threshold(thresholdImg, thresholdImg, threshValue, 255.0, Imgproc.THRESH_BINARY_INV)
    // create the new image
    val foreground = Mat(
        frame.size(), CvType.CV_8UC3, Scalar(
            255.toDouble(),
            255.toDouble(),
            255.toDouble()
        )
    )
    frame.copyTo(foreground, thresholdImg)
    val img_bitmap =
        Bitmap.createBitmap(foreground!!.cols(), foreground!!.rows(), Bitmap.Config.ARGB_8888)
    Utils.matToBitmap(foreground!!, img_bitmap)
    imageView.setImageBitmap(img_bitmap)
    return foreground
}
Upvotes: 3
Views: 4542
Reputation: 27567
import cv2
import numpy as np

def process(img):
    img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    img_canny = cv2.Canny(img_gray, 10, 20)
    kernel = np.ones((13, 13))
    img_dilate = cv2.dilate(img_canny, kernel, iterations=1)
    return cv2.erode(img_dilate, kernel, iterations=1)

def get_mask(img):
    contours, _ = cv2.findContours(process(img), cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)
    blank = np.zeros(img.shape[:2]).astype('uint8')
    for cnt in contours:
        if cv2.contourArea(cnt) > 500:
            peri = cv2.arcLength(cnt, True)
            approx = cv2.approxPolyDP(cnt, peri * 0.004, True)
            cv2.drawContours(blank, [approx], -1, 255, -1)
    return blank

img = cv2.imread("crystal.jpg")
img_masked = cv2.bitwise_and(img, img, mask=get_mask(img))
cv2.imshow("Masked", img_masked)
cv2.waitKey(0)
The process function converts the image to grayscale, detects edges with the Canny edge detector, and then dilates and erodes the result, so that the contours of the image can be detected properly:
import cv2
import numpy as np

def process(img):
    img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    img_canny = cv2.Canny(img_gray, 10, 20)
    kernel = np.ones((13, 13))
    img_dilate = cv2.dilate(img_canny, kernel, iterations=1)
    return cv2.erode(img_dilate, kernel, iterations=1)
The get_mask function takes in an image, uses the process function to get the contours of the image, and returns a mask with every contour (with an area greater than 500, to filter out noise) filled in onto the blank image. I also approximated the contours to smoothen things out a bit:
def get_mask(img):
    contours, _ = cv2.findContours(process(img), cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)
    blank = np.zeros(img.shape[:2]).astype('uint8')
    for cnt in contours:
        if cv2.contourArea(cnt) > 500:
            peri = cv2.arcLength(cnt, True)
            approx = cv2.approxPolyDP(cnt, peri * 0.004, True)
            cv2.drawContours(blank, [approx], -1, 255, -1)
    return blank
Finally, read in the image and mask it with the cv2.bitwise_and method, along with the get_mask function we defined, which uses the process function we defined. Show the masked image in the end:
img = cv2.imread("crystal.jpg")
img_masked = cv2.bitwise_and(img, img, mask=get_mask(img))
cv2.imshow("Masked", img_masked)
cv2.waitKey(0)
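Instead of the cv2.bitwise_and method, you can use the cv2.merge method to make the background transparent:
img = cv2.imread("crystal.jpg")
# list(...) keeps this working on OpenCV versions where cv2.split returns a tuple
img_masked = cv2.merge(list(cv2.split(img)) + [get_mask(img)])
cv2.imwrite("masked_crystal.png", img_masked)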
Resulting image (screenshot):
Explanation:
With the cv2 module and the numpy module (as np) already imported, and the process and get_mask functions already defined, we can read in the image:
img = cv2.imread("crystal.jpg")
The cv2.split method takes in an image array and returns every individual channel present in the image (as a list or a tuple, depending on the OpenCV version). In our case, we only have 3 channels, and in order to make the image transparent, we need a fourth channel: the alpha channel. The cv2.merge method does the opposite of cv2.split; it takes in a list of individual channels and returns an image array with the channels. So next we get the BGR channels of the image in a list, and concatenate the mask of the image as the alpha channel:
img_masked = cv2.merge(list(cv2.split(img)) + [get_mask(img)])
cv2.imwrite("masked_crystal.png", img_masked)
Here are some more examples of the cv2.merge method: Python cv2.merge() Examples
Upvotes: 4
Reputation: 5805
The task, as you have seen, is not trivial at all. OpenCV has a segmentation algorithm called "GrabCut" that tries to solve this particular problem. The algorithm is pretty good at classifying background and foreground pixels; however, it needs very specific information to work. It can operate in two modes:
1st Mode (Mask Mode): Using a Binary Mask (same size as the original input) where 100% definite background pixels are marked, as well as 100% definite foreground pixels. You don't have to mark every pixel on the image, just a region where you are sure the algorithm will find either class of pixels.
2nd Mode (Foreground ROI): Using a bounding box that encloses 100% definite foreground pixels.
Now, I use the notation "100% definite" to label those pixels you are 100% sure correspond to either the background or the foreground. The algorithm classifies the pixels into four possible classes: "Definite Background", "Probable Background", "Definite Foreground" and "Probable Foreground". It will predict both Probable Background and Probable Foreground pixels, but it needs a priori information of where to find at least "Definite Foreground" pixels.
With that said, we can use GrabCut in its 2nd mode (Rectangle ROI) to try and segment the input image. We can try and get a first, rough, binary mask of the input. This will mark where we are sure the algorithm can find foreground pixels. We will feed the bounding rectangle of this rough mask to the algorithm and check out the results. Now, the method is not easy and its automation is not straightforward; some of the parameters are set manually and work particularly well for this input image. I don't know the Java implementation of OpenCV, so I'm giving you the solution for Python. Hopefully you will be able to port it. This is the general outline of the algorithm: get a rough foreground mask via adaptive thresholding, clean it up with a morphological closing, find the bounding rectangle of the biggest blob, run GrabCut initialized with that rectangle, and build the final mask from the labels GrabCut returns.
This is the code:
# imports:
import cv2
import numpy as np
# image path
path = "D://opencvImages//"
fileName = "backgroundTest.png"
# Reading an image in default mode:
inputImage = cv2.imread(path + fileName)
# (Optional) Deep copy for results:
inputImageCopy = inputImage.copy()
# Convert BGR to grayscale:
grayscaleImage = cv2.cvtColor(inputImage, cv2.COLOR_BGR2GRAY)
# Adaptive Thresholding
windowSize = 31
windowConstant = 11
binaryImage = cv2.adaptiveThreshold(grayscaleImage, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY_INV, windowSize, windowConstant)
The first step is to get the rough foreground mask using Adaptive Thresholding. Here, I've used the ADAPTIVE_THRESH_MEAN_C method, where the (local) threshold value is the mean of a neighborhood area on the input image, minus a constant. This yields the following image:
It's pretty rough, right? We can clean this up a little bit using some morphology. I use a Closing with a rectangular kernel of size 3 x 3 and 10 iterations to join the big blobs of white pixels. I've wrapped the OpenCV functions inside custom functions that save me the typing of some lines. These helper functions are presented at the end of this post. For now, this step is as follows:
# Apply a morphological closing with:
# Rectangular SE size 3 x 3 and 10 iterations
binaryImage = morphoOperation(binaryImage, 3, 10, "Closing")
This is the rough mask after filtering:
A little bit better. Ok, we can now search for the bounding box of the biggest contour. A search for the outer contours via cv2.RETR_EXTERNAL will suffice for this example, as we can safely ignore children contours, like this:
# Find the EXTERNAL contours on the binary image:
contours, hierarchy = cv2.findContours(binaryImage, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
# This list will store the target bounding box
maskRect = []
Additionally, let's get a list ready where we will store the target bounding rectangle. Let's now search on the detected contours. I've also implemented an area filter in case some noise is present, so the blobs below a certain area threshold are ignored:
# Look for the outer bounding boxes (no children):
for i, c in enumerate(contours):
    # Get blob area:
    currentArea = cv2.contourArea(c)
    # Get the bounding rectangle:
    boundRect = cv2.boundingRect(c)
    # Set a minimum area
    minArea = 1000
    # Look for the target contour:
    if currentArea > minArea:
        # Found the target bounding rectangle:
        maskRect = boundRect
        # (Optional) Draw the rectangle on the input image:
        # Get the dimensions of the bounding rect:
        rectX = boundRect[0]
        rectY = boundRect[1]
        rectWidth = boundRect[2]
        rectHeight = boundRect[3]
        # (Optional) Set color and draw:
        color = (0, 0, 255)
        cv2.rectangle( inputImageCopy, (int(rectX), int(rectY)),
                       (int(rectX + rectWidth), int(rectY + rectHeight)), color, 2 )
        # (Optional) Show image:
        cv2.imshow("Bounding Rectangle", inputImageCopy)
        cv2.waitKey(0)
Optionally you can draw the bounding box found by the algorithm. This is the resulting image:
It is looking good. Note that some obvious background pixels are also enclosed by the ROI. GrabCut will try to re-classify these pixels into their proper class, i.e., "Probable Background". Alright, let's prepare the data for GrabCut:
# Create mask for Grab n Cut,
# The mask is a uint8 type, same dimensions as
# original input:
mask = np.zeros(inputImage.shape[:2], np.uint8)
# Grab n Cut needs two empty matrices of
# Float type (64 bits) and size 1 (rows) x 65 (columns):
bgModel = np.zeros((1, 65), np.float64)
fgModel = np.zeros((1, 65), np.float64)
We need to prepare three matrices/numpy arrays/whatever data type is used to represent images in Java. The first is where the segmentation mask obtained by GrabCut will be stored. This mask will have values from 0 to 3 to denote the class of each pixel on the original input. The bgModel and fgModel matrices are used internally by the algorithm to store the statistical model of the foreground and background. Be aware that both of these matrices are float matrices. Lastly, GrabCut is an iterative algorithm. It will run for n iterations. Ok, let's run GrabCut:
# Run Grab n Cut on INIT_WITH_RECT mode:
grabCutIterations = 5
mask, bgModel, fgModel = cv2.grabCut(inputImage, mask, maskRect, bgModel, fgModel, grabCutIterations, mode=cv2.GC_INIT_WITH_RECT)
Ok, the classification is done. You can try and convert mask to an (image) visible type to check out the labels of each pixel. This is optional, but should you wish to do so, you'd get 4 matrices, one for each class. For example, for the "Definite Background" class, GrabCut found these are the pixels belonging to such class (in white):
The pixels belonging to the "Probable Background" class are these:
That's pretty good, huh? Here are the pixels belonging to the "Probable Foreground" class:
Very nice. Let's create the final segmentation mask, because mask is not an image, it is just an array containing labels for each pixel. We will use the Definite Background and Probable Background pixels to set the final mask; we can then "normalize" the data range and convert it to uint8 to obtain an actual image:
# Set all definite background (0) and probable background pixels (2)
# to 0 while definite foreground and probable foreground pixels are
# set to 1
outputMask = np.where((mask == cv2.GC_BGD) | (mask == cv2.GC_PR_BGD), 0, 1)
# Scale the mask from the range [0, 1] to [0, 255]
outputMask = (outputMask * 255).astype("uint8")
This is the actual segmentation mask:
Alright, we can clean this image up a little bit, because there are some small holes produced by misclassifying foreground pixels as background pixels. Let's apply just another morphological closing, this time using 5 iterations:
# (Optional) Apply a morphological closing with:
# Rectangular SE size 3 x 3 and 5 iterations:
outputMask = morphoOperation(outputMask, 3, 5, "Closing")
Finally, use this outputMask in an AND with the original image to produce the final segmented result:
# Apply a bitwise AND to the image using our mask generated by
# GrabCut to generate the final output image:
segmentedImage = cv2.bitwise_and(inputImage, inputImage, mask=outputMask)
cv2.imshow("Segmented Image", segmentedImage)
cv2.waitKey(0)
This is the final result:
If you need transparency on this image, it is very straightforward to use outputMask as the alpha channel. This is the helper function I used earlier:
# Applies a morpho operation:
def morphoOperation(binaryImage, kernelSize, opIterations, opString):
    # Get the structuring element:
    morphKernel = cv2.getStructuringElement(cv2.MORPH_RECT, (kernelSize, kernelSize))
    # Perform Operation:
    if opString == "Closing":
        op = cv2.MORPH_CLOSE
    else:
        print("Morpho Operation not defined!")
        return None
    outImage = cv2.morphologyEx(binaryImage, op, morphKernel, None, None, opIterations, cv2.BORDER_REFLECT101)
    return outImage
Upvotes: 6