Maoristirn

Reputation: 51

Find area in image with python and opencv

I would like to find an area in about 1,500 images that all share a similar format. They are all scans of painted or photographed portraits of people, and they all feature the same color card. The color card may be placed on either side of the image (see the sample images below).

The result should be an image, only containing the person's portrait.

I am able to find the color card with opencv template matching:

import cv2
import numpy as np

method = cv2.TM_CCOEFF_NORMED

# Read the images from the file
img_rgb = cv2.imread('./imgs/test_portrait.jpg')
img_gray = cv2.cvtColor(img_rgb, cv2.COLOR_BGR2GRAY)

template = cv2.imread('./portraet_color_card.png', 0)
w, h = template.shape[::-1]

result = cv2.matchTemplate(img_gray, template, method)

threshold = .97
loc = np.where(result >= threshold)
for pt in zip(*loc[::-1]):
   print("Found:", pt)
   cv2.rectangle(img_rgb, pt, (pt[0] + w, pt[1] + h), (0,255,255), 2)

cv2.imwrite('result.png',img_rgb)

Output:

Found: (17, 303)
Found: (18, 303)
Found: (17, 304)
Found: (18, 304)
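
The four hits above are the same card reported at neighbouring pixels. A minimal sketch of keeping only the single best position, assuming a score map where higher is better (as with TM_CCOEFF_NORMED; for the TM_SQDIFF variants lower is better). The helper name `best_match` and the synthetic score map are illustrative only:

```python
import numpy as np

def best_match(result, threshold=0.97):
    """Return (x, y) of the highest score in a template-matching score map,
    or None if that score is below the threshold."""
    # argmax works on the flattened array; unravel_index maps it back to (row, col)
    row, col = np.unravel_index(np.argmax(result), result.shape)
    if result[row, col] < threshold:
        return None
    return int(col), int(row)

# Synthetic score map standing in for the output of cv2.matchTemplate
scores = np.zeros((400, 600), dtype=np.float32)
scores[303:305, 17:19] = 0.98
scores[303, 17] = 0.99
print("Best:", best_match(scores))
```

cv2.minMaxLoc(result) returns the same location together with the score, if you prefer to stay within OpenCV.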

With the coordinates and the image dimensions, I can determine whether the color card is on the left or the right and crop the image accordingly. The result is far from perfect, though, as the borders are still there.
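
For reference, the left/right decision and crop described above could look roughly like this. It is only a sketch: `crop_away_card` is a hypothetical helper, `card_x` is the x coordinate of the match and `card_w` the template width, and it removes the whole vertical strip containing the card without touching the remaining borders:

```python
import numpy as np

def crop_away_card(img, card_x, card_w):
    """Drop the vertical strip that contains the color card.

    card_x: x coordinate of the card's top-left corner (from the match)
    card_w: width of the template
    """
    img_w = img.shape[1]
    card_center = card_x + card_w / 2
    if card_center < img_w / 2:
        # Card is on the left: keep everything to the right of it
        return img[:, card_x + card_w:]
    # Card is on the right: keep everything to the left of it
    return img[:, :card_x]

# Dummy 100x200 image in place of a real scan
dummy = np.zeros((100, 200, 3), dtype=np.uint8)
print(crop_away_card(dummy, 17, 30).shape)
print(crop_away_card(dummy, 160, 30).shape)
```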

Is there a better way to extract the portraits from the images? I would prefer to work with python and opencv but I am open to other suggestions on how to solve this problem for a larger number of images.

Samples: Sample image 1 Sample image 2 Sample image 3 Sample image 4

Template: Template

Upvotes: 2

Views: 3293

Answers (2)

Bilal

Reputation: 3864

This solution assumes that the portrait is the largest pattern in the image.


Solution Steps in order:

Classical Image processing to obtain the important features from the image:

  • Conversion to Gray level.
  • Gaussian Blur to reduce noise and smooth the image.
  • Edge Detection, using Canny in my case.
  • Morphological Dilation to group the features into two main patterns.
  • Largest Connected components Detection (credit to an old SO answer)
  • The rest is to mask the largest connected component.

Note that this solution relies on some assumptions, so it might not always generalize, but I have tested it with the given images.

#!/usr/bin/python3
# -*- coding: utf-8 -*-

import cv2
import numpy as np

class ImgProcessor:
    def __init__(self, path, imName):
        self.path = path
        self.imName = imName
        self.original = cv2.imread(self.path+self.imName)

    def imProcess(self, ksmooth=7, kdilate=3, thlow=50, thigh=100):
        # Work on a copy of the original BGR image
        img_bgr = self.original.copy()
        # Convert Image to Gray
        img_gray = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2GRAY)
        # Gaussian Filtering for Noise Removal
        gauss = cv2.GaussianBlur(img_gray, (ksmooth, ksmooth), 0)
        # Canny Edge Detection
        edges = cv2.Canny(gauss, thlow, thigh)
        # Morphological Dilation
        # TODO: experiment with different kernels
        kernel = np.ones((kdilate, kdilate), 'uint8')
        dil = cv2.dilate(edges, kernel)

        return dil
    
    def largestCC(self, imBW):
        # Extract Largest Connected Component
        # Source: https://stackoverflow.com/a/47057324
        image = imBW.astype('uint8')
        nb_components, output, stats, centroids = cv2.connectedComponentsWithStats(image, connectivity=4)
        # Label 0 is the background, so at least two labels are needed
        if nb_components < 2:
            raise ValueError("no foreground components found")
        sizes = stats[:, cv2.CC_STAT_AREA]

        # Largest component among the foreground labels 1..nb_components-1
        max_label = 1 + int(np.argmax(sizes[1:]))

        img2 = np.zeros(output.shape, dtype=np.uint8)
        img2[output == max_label] = 255
        return img2
    
    def maskCorners(self, mask, outval=1):
        # Bounding box of all non-zero pixels in the mask
        rows = np.nonzero(mask.sum(axis=1))[0]
        cols = np.nonzero(mask.sum(axis=0))[0]
        y0, y1 = rows.min(), rows.max()
        x0, x1 = cols.min(), cols.max()
        output = np.zeros_like(mask)
        # +1 so the last row and column of the box are included
        output[y0:y1 + 1, x0:x1 + 1] = outval
        return output

    def extractROI(self):
        im = self.imProcess()
        lgcc = self.largestCC(im)
        lgcc = lgcc.astype(np.uint8)
        roi = self.maskCorners(lgcc)
        # Mask the original BGR image with the ROI mask
        exroi = cv2.bitwise_and(self.original, self.original, mask = roi)
        return exroi

    def show_res(self):
        result = self.extractROI()
        cv2.namedWindow("Result", cv2.WINDOW_NORMAL)
        cv2.imshow("Result", result)
        cv2.waitKey(0)

# ==============================================
if __name__ == "__main__":
    # TODO: change the path, and image name to suit your needs
    impr_ = ImgProcessor(path="/home/", imName="img.png")
    # show_res() only displays the result and returns nothing
    impr_.show_res()
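
The class above blacks out everything outside the detected region but keeps the original image size. If the goal is an image that contains only the portrait, the bounding box of the mask can be used for a crop instead. A minimal sketch, where `crop_to_mask` is a hypothetical helper that is not part of the code above:

```python
import numpy as np

def crop_to_mask(img, mask):
    """Crop img to the bounding box of the non-zero pixels in mask."""
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    if rows.size == 0 or cols.size == 0:
        return img  # nothing detected; return the input unchanged
    return img[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]

# Dummy data standing in for the original image and the largest-component mask
demo_img = np.arange(10 * 12 * 3).reshape(10, 12, 3)
demo_mask = np.zeros((10, 12), dtype=np.uint8)
demo_mask[2:6, 3:9] = 255
print(crop_to_mask(demo_img, demo_mask).shape)
```

Inside extractROI this would presumably be called as crop_to_mask(self.original, lgcc) in place of the bitwise_and step.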

Upvotes: 2

Renat Gilmanov

Reputation: 17895

First of all, let's pretend you have at least 15K images, so that it is worth spending valuable time on automating this (1.5K can be processed manually). I'll try to define a high-level approach and provide some PoC results (sorry, no code; I use a custom CV tool/pipeline).

As you mentioned, the background color of the card varies, so let's play it safe: color cards contain some specific colors, and I'll use them as an initial "key". These colors are unique, so I can define a proper threshold to make my results stable:

enter image description here
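
The results in this answer come from a custom tool, so the following is not that pipeline, only a rough NumPy sketch of the key-color idea. The key color, the tolerance and both helper names are assumptions chosen for illustration:

```python
import numpy as np

def key_color_mask(img_bgr, key_bgr, tol=20):
    """Boolean mask of pixels within `tol` of key_bgr on every channel."""
    diff = np.abs(img_bgr.astype(np.int16) - np.array(key_bgr, dtype=np.int16))
    return np.all(diff <= tol, axis=-1)

def bounding_box(mask):
    """(x, y, w, h) of the True pixels in mask, or None if there are none."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return None
    x0, y0 = xs.min(), ys.min()
    return int(x0), int(y0), int(xs.max() - x0 + 1), int(ys.max() - y0 + 1)

# Synthetic image: grey background with one pure-red cell (BGR order)
demo = np.full((50, 80, 3), 200, dtype=np.uint8)
demo[10:20, 30:45] = (0, 0, 255)
print(bounding_box(key_color_mask(demo, (0, 0, 255))))
```

With OpenCV, cv2.inRange on an HSV conversion of the image is a common alternative for this kind of thresholding.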

Two segmented cells provide us with a pretty simple validation approach (compare dimensions, relative location, etc). At this point we can easily find color card background (it is better to do multiple measurements near identified color cells):

enter image description here
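
A validation rule of the kind mentioned above (compare dimensions and relative location of the two cells) might be sketched as follows. The function name and both tolerances are hypothetical and would need tuning against real cards:

```python
def cells_look_valid(cell_a, cell_b, size_tol=0.2, max_gap_factor=4.0):
    """Check two (x, y, w, h) boxes: similar size and reasonably close together."""
    ax, ay, aw, ah = cell_a
    bx, by, bw, bh = cell_b
    similar_w = abs(aw - bw) <= size_tol * max(aw, bw)
    similar_h = abs(ah - bh) <= size_tol * max(ah, bh)
    # Distance between the cell centers, compared with the largest cell side
    dx = (ax + aw / 2) - (bx + bw / 2)
    dy = (ay + ah / 2) - (by + bh / 2)
    dist = (dx * dx + dy * dy) ** 0.5
    close = dist <= max_gap_factor * max(aw, ah, bw, bh)
    return similar_w and similar_h and close

print(cells_look_valid((30, 10, 15, 10), (30, 25, 15, 10)))
print(cells_look_valid((30, 10, 15, 10), (300, 400, 15, 10)))
```

Images that fail such a check can be set aside for manual processing rather than cropped automatically.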

As you can see, some noise and lossy compression artefacts affect the result, but it is still good enough (yet another validation possibility: compare cell and card size). At this point, we can do additional measurements in order to find the colors of the background.
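
One way to take "multiple measurements" that tolerates noise and compression artefacts is to pool several small patches and use the per-channel median. This is a sketch under my own assumptions; `sample_background`, the patch size and the sample points are illustrative, not taken from the tool used here:

```python
import numpy as np

def sample_background(img_bgr, points, half=2):
    """Median BGR color over small patches centered on the given (x, y) points."""
    h, w = img_bgr.shape[:2]
    patches = []
    for x, y in points:
        # Clamp each patch to the image bounds
        y0, y1 = max(0, y - half), min(h, y + half + 1)
        x0, x1 = max(0, x - half), min(w, x + half + 1)
        patches.append(img_bgr[y0:y1, x0:x1].reshape(-1, 3))
    return np.median(np.concatenate(patches), axis=0)

# Synthetic background with a single outlier pixel
bg = np.full((40, 40, 3), 120, dtype=np.uint8)
bg[10, 10] = (0, 255, 0)
print(sample_background(bg, [(10, 10), (30, 30), (0, 0)]))
```

Because the median ignores a few outlier pixels, the estimated color stays stable enough to allow a smaller threshold in the next step.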

Let's review the simple cases first: the results seem to be good enough, so the final crop and small corrections can be easily implemented:

enter image description here

enter image description here

enter image description here

Some cases will not be that straightforward:

enter image description here

I'd suggest investing more time in validation rules and processing all tricky cases manually, but with some additional time the "common trickiness" can also be addressed.

Anyway, here is a brief summary:

  1. use key colors to reliably identify the color card (and do initial validation)
  2. do multiple measurements to find color card background (so you can use a smaller threshold)
  3. do multiple measurements to define image background
  4. a validation strategy is a must, so it will be easier to process the small number of leftovers manually

enter image description here

PS: white on white is fun, but Kazimir Malevich did that quite a while ago, no need to repeat :)

Upvotes: 0
