Reputation: 51
I would like to find an area in about 1.5K images which all have a similar format: they are scans of painted or photographed images of persons, and they all feature the same color card. The color card may be placed on either side of the image (see sample image below).
The result should be an image containing only the person's portrait.
I am able to find the color card with OpenCV template matching:
import cv2
import numpy as np

# TM_CCOEFF_NORMED: higher scores mean better matches
method = cv2.TM_CCOEFF_NORMED

# Read the images from the file
img_rgb = cv2.imread('./imgs/test_portrait.jpg')
img_gray = cv2.cvtColor(img_rgb, cv2.COLOR_BGR2GRAY)
template = cv2.imread('./portraet_color_card.png', 0)
w, h = template.shape[::-1]

result = cv2.matchTemplate(img_gray, template, method)
threshold = .97
loc = np.where(result >= threshold)
for pt in zip(*loc[::-1]):
    print("Found:", pt)
    cv2.rectangle(img_rgb, pt, (pt[0] + w, pt[1] + h), (0, 255, 255), 2)
cv2.imwrite('result.png', img_rgb)
Output:
Found: (17, 303)
Found: (18, 303)
Found: (17, 304)
Found: (18, 304)
With the coordinates and the image dimensions, I am able to determine whether the card is on the left or the right side and can crop the image. The result is far from perfect, as the borders are still there.
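Roughly, my side detection and crop look like this (a minimal sketch continuing the snippet above; cutting at the card's edge is exactly what leaves the borders in):

img_h, img_w = img_rgb.shape[:2]
ys, xs = loc                        # row/column indices from np.where
x = xs[0]                           # take the first match
if x + w // 2 < img_w // 2:         # card sits in the left half
    portrait = img_rgb[:, x + w:]   # crop away the card and the left edge
else:                               # card sits in the right half
    portrait = img_rgb[:, :x]       # crop away the card and the right edge
cv2.imwrite('portrait.png', portrait)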
Is there a better way to extract the portraits from the images? I would prefer to work with Python and OpenCV, but I am open to other suggestions on how to solve this problem for a larger number of images.
Upvotes: 2
Views: 3293
Reputation: 3864
This solution assumes that the portrait is the largest pattern in the image.
It uses classical image processing to obtain the important features from the image: Canny edge detection, in my case. Note that this solution makes some assumptions, so it might not always generalize, but I have tested it on the given images.
#!/usr/bin/python3
# -*- coding: utf-8 -*-
import cv2
import numpy as np

class ImgProcessor:
    def __init__(self, path, imName):
        self.path = path
        self.imName = imName
        self.original = cv2.imread(self.path + self.imName)

    def imProcess(self, ksmooth=7, kdilate=3, thlow=50, thigh=100):
        # Work on a copy of the original BGR image
        img_bgr = self.original.copy()
        # Convert image to gray
        img_gray = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2GRAY)
        # Gaussian filtering for noise removal
        gauss = cv2.GaussianBlur(img_gray, (ksmooth, ksmooth), 0)
        # Canny edge detection
        edges = cv2.Canny(gauss, thlow, thigh)
        # Morphological dilation to close small gaps in the edges
        # TODO: experiment with different kernels
        kernel = np.ones((kdilate, kdilate), 'uint8')
        dil = cv2.dilate(edges, kernel)
        return dil

    def largestCC(self, imBW):
        # Extract the largest connected component
        # Source: https://stackoverflow.com/a/47057324
        image = imBW.astype('uint8')
        nb_components, output, stats, centroids = cv2.connectedComponentsWithStats(image, connectivity=4)
        sizes = stats[:, -1]
        # Label 0 is the background, so start the search at label 1
        max_label = 1
        max_size = sizes[1]
        for i in range(2, nb_components):
            if sizes[i] > max_size:
                max_label = i
                max_size = sizes[i]
        img2 = np.zeros(output.shape)
        img2[output == max_label] = 255
        return img2

    def maskCorners(self, mask, outval=1):
        # Fill the bounding box of the nonzero region of the mask
        y0 = np.min(np.nonzero(mask.sum(axis=1))[0])
        y1 = np.max(np.nonzero(mask.sum(axis=1))[0])
        x0 = np.min(np.nonzero(mask.sum(axis=0))[0])
        x1 = np.max(np.nonzero(mask.sum(axis=0))[0])
        output = np.zeros_like(mask)
        output[y0:y1 + 1, x0:x1 + 1] = outval
        return output

    def extractROI(self):
        im = self.imProcess()
        lgcc = self.largestCC(im)
        lgcc = lgcc.astype(np.uint8)
        roi = self.maskCorners(lgcc)
        # Mask the original BGR image with the rectangular ROI mask
        exroi = cv2.bitwise_and(self.original, self.original, mask=roi)
        return exroi

    def show_res(self):
        result = self.extractROI()
        cv2.namedWindow("Result", cv2.WINDOW_NORMAL)
        cv2.imshow("Result", result)
        cv2.waitKey(0)

# ==============================================
if __name__ == "__main__":
    # TODO: change the path and image name to suit your needs
    impr_ = ImgProcessor(path="/home/", imName="img.png")
    impr_.show_res()
Upvotes: 2
Reputation: 17895
First of all, let's pretend you have at least 15K images, so it is worth spending valuable time on automating this (1.5K could be processed manually). I'll try to define a high-level approach and provide some PoC results (sorry, no code; I use a custom CV tool/pipeline).
As you mentioned, the background color of the card varies, so let's play it safe: the color cards contain some specific colors. I'll use those as an initial "key". The colors are distinctive, so I can define a proper threshold to keep my results stable:
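For readers without such a tool, the color-key step maps directly onto plain OpenCV. A minimal sketch, where the reference BGR value and the tolerance are assumptions to be tuned against the real scans:

import cv2
import numpy as np

img = cv2.imread('scan.jpg')
# Hypothetical BGR value of one card cell and a hand-tuned tolerance
key_color = np.array([40, 40, 200], dtype=np.int16)   # e.g. a red cell
tol = 30
lower = np.clip(key_color - tol, 0, 255).astype(np.uint8)
upper = np.clip(key_color + tol, 0, 255).astype(np.uint8)
mask = cv2.inRange(img, lower, upper)
# Remove small speckles caused by noise and compression
mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))
# Each remaining blob is a candidate color cell
n, labels, stats, centroids = cv2.connectedComponentsWithStats(mask)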
Two segmented cells give us a pretty simple validation approach (compare dimensions, relative location, etc.). At this point we can easily find the color card background (it is better to take multiple measurements near the identified color cells):
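Continuing the sketch above, such a validation could compare the two largest candidate blobs; all thresholds here are assumptions:

# Sort candidate blobs by area, skipping background label 0
order = sorted(range(1, n), key=lambda i: stats[i, cv2.CC_STAT_AREA], reverse=True)
if len(order) >= 2:
    a, b = stats[order[0]], stats[order[1]]
    # Cells should have comparable area ...
    similar = 0.5 < a[cv2.CC_STAT_AREA] / b[cv2.CC_STAT_AREA] < 2.0
    # ... and sit close together on the card
    near = abs(a[cv2.CC_STAT_TOP] - b[cv2.CC_STAT_TOP]) < 3 * a[cv2.CC_STAT_HEIGHT]
    card_found = similar and near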
As you can see, some noise and lossy-compression artefacts affect the result, but it is still good enough (yet another validation possibility: compare cell and card sizes). At this point, we can take additional measurements to find the colors of the background.
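A rough way to take such a measurement near an identified cell (continuing the sketch; the strip offsets are assumptions):

# Sample a strip just left of a validated cell and take the median
# BGR value as the card background color
x, y = a[cv2.CC_STAT_LEFT], a[cv2.CC_STAT_TOP]
cw, ch = a[cv2.CC_STAT_WIDTH], a[cv2.CC_STAT_HEIGHT]
strip = img[y:y + ch, max(0, x - cw):x]
card_bg = np.median(strip.reshape(-1, 3), axis=0)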
Let's review the simple cases first: the results seem good enough, so the final crop and small corrections can be implemented easily:
Some cases will not be that straightforward:
I'd suggest investing more time in validation rules and processing all tricky cases manually, but with some additional time "common trickiness" can also be addressed.
Anyway, here is a brief summary:
PS: white on white is fun, but Kazimir Malevich did that quite a while ago, no need to repeat :)
Upvotes: 0