user947668
user947668

Reputation: 2728

Get boundaries of overlapping rectangles

I would like to obtain the boundaries of overlapping rectangles. Source image looks like: enter image description here

The target is to get boundaries of rectangles 1-4: enter image description here

So far I have the following code:

import cv2
import numpy as np

img = cv2.imread("template.png")
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
flt = cv2.inRange(hsv,(58,55,218),(59,255,247))
flt = cv2.threshold(flt, 254, 255, cv2.THRESH_BINARY)[1]
flt = cv2.GaussianBlur(flt, (5, 1), 100)

contours, hierarchy = cv2.findContours(flt, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_SIMPLE)
hierarchy = hierarchy[0]

canvas = img.copy()
for idx, cntr in enumerate(contours):
    if hierarchy[idx][3] != -1:
        arclen = cv2.arcLength(cntr, True)
        approx = cv2.approxPolyDP(cntr, arclen*0.01, True)
        cv2.drawContours(canvas, [approx], -1, (0,0,255), 1, cv2.LINE_AA)

cv2.imshow('template', canvas)
cv2.waitKey(0)
cv2.destroyAllWindows()

Which returns me the following result, it is close but not exactly what I need: enter image description here

I assume that some kind of comparison between each rectangle should be implemented, but i'm stuck here. Any help is appreciated.

Upvotes: 0

Views: 692

Answers (1)

stateMachine
stateMachine

Reputation: 5805

This is a very lengthy post, please, bear with me as I try to explain my ideas in coherent ways. This is my take on the problem. The idea is to reconstruct each rectangle from its corners. We first locate the corner points on the image via the Hit-or-Miss operation. Once we have the corners, we store their coordinate points in a list. We then apply an elaborated algorithm to reconstruct N full rectangles from the points in this list. The steps are:

  1. Threshold the image via Otsu's Thresholding
  2. Apply a series of Hit-or-Miss kernels to obtain the four corners of the binary image
  3. Construct a list of corner points from the corners image
  4. Apply an elaborated algorithm for rectangle reconstruction

Don't worry about the last point. I explain my unoptimized and unvectorized algorithm on the last section. For now, let's get the first part of the whole thing – the corners list. I'm testing this with this first image. This is the code:

# Imports:
import numpy as np
import cv2

# Set the image path
path = "D://opencvImages//"
fileName = "rectangles1.png"

# Reading an image in default mode:
inputImage = cv2.imread(path + fileName)

# Prepare a deep copy for results:
inputImageCopy = inputImage.copy()

# Convert BGR to Grayscale
grayImage = cv2.cvtColor(inputImage, cv2.COLOR_BGR2GRAY)

# Threshold via Otsu:
_, binaryImage = cv2.threshold(grayImage, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

# Get image dimensions
(height, width) = binaryImage.shape

The first part is straightforward. We need a binary image to operate on. We get the following binary image by applying Otsu's Thresholding:

Now, let's apply the Hit-or-Miss operation. This morphological operation identifies groups of pixels that match a given pattern. The pattern is given by a kernel – a 3 x 3 matrix with the target binary pattern. We have four corners – I will apply four different kernels. Check out the docs for more info on the operation and how you specify the target pattern. This is the code for this section:

# Prepare a list with all possible corner kernels:
kernelList = []

# First corner:
kernel = np.array((
        [-1, -1, 0],
        [-1, 1, 0],
        [0, 0, 0]), dtype="int")
# Store it into the list:
kernelList.append(kernel)

# Second corner:
kernel = np.array((
        [-1, -1, 0],
        [1, -1, 0],
        [0, 0, 0]), dtype="int")
# Store it into the list:
kernelList.append(kernel)

# Third corner:
kernel = np.array((
        [-1, 1, 0],
        [-1, -1, 0],
        [0, 0, 0]), dtype="int")
# Store it into the list:
kernelList.append(kernel)

# Fourth corner:
kernel = np.array((
        [1, -1, 0],
        [-1, -1, 0],
        [0, 0, 0]), dtype="int")
# Store it into the list:
kernelList.append(kernel)

Ok. Now, we will run a loop that fetches each kernel and applies it to the binary image. The idea is to get a composite image of all found corners, so we will be building a mask and adding it to itself – this is because at the end of each loop iteration a possible set of corners are found, and we must accumulate this info in our final mask. This can be implemented by simply ORing the mask in series, as it is being generated by the loop:

# Prepare the image that will hold the corners of the image:
cornerMask = np.zeros((height, width, 1), np.uint8)

# Apply all the kernels to the image:
for i in range(len(kernelList)):
    # Get current kernel>
    currentKernel = kernelList[i]
    # Apply it to the binary image:
    hitMiss = cv2.morphologyEx(binaryImage, cv2.MORPH_HITMISS, currentKernel)
    # Accumulate Mask
    cornerMask = cv2.bitwise_or(hitMiss, cornerMask)

We end up with this corner image, where all the white pixels indicate the location of a "kernel match":

The operation also produced a white line at the left side of the image. No worries, let's flood-fill at the top left corner (x=0, y=0) with black color to get rid of this artifact:

# Flood fill at the left top of the image:
cv2.floodFill(cornerMask, mask=None, seedPoint=(int(0),int(0)), newVal=(0))

This is the result:

Ok, let's convert this image to actual coordinates. There's a lot of ways to achieve this. I’m coming from C++ and my NumPy knowledge is still green, so I use the most naïve solution for this: A loop that searches for the white pixels and stores their location in a list. If you know how to vectorize this operation, please share it so I can learn!

My filthy approach has an advantage, though, it raster-scans the image left to right, top to bottom, so I get the corners in a very predictable and known way:

# Loop through the image looking for white pixels
# store their location in the points list:    
for j in range(height):
    for i in range(width):
        # Get current pixel:
        currentPixel = cornerMask[j,i]
        # if white, store the coordinates:
        if currentPixel == 255:
            pointsList.append(list((i,j,0)))
                
# Print list:
print(pointsList)

For the first test, this is the printed contents of the list:

[[56, 46, 0], [207, 46, 0], [179, 123, 0], [312, 123, 0], [56, 362, 0], [207, 362, 0], [179, 427, 0], [312, 427, 0]]

Alright, here comes the "meat" of the algorithm. However, I would like to make some points before explaining it:

  1. There's no optimization – I wanted to make sure the thing works before optimizing it

  2. The algorithm could be refactored into a recursive function

  3. I’ve been working with Python for 6 months; As I mentioned earlier, I'm coming from C++ so there's surely a more Pythonic way of doing this

By all means, feel free to optimize this algorithm, vectorize it, and then share it with us. Anyway. The idea is after you get the corner points in the image that identify the corners, we will try to rebuild each rectangle using this information. Each rectangle will be made of four pairs of points – the corners.

Let's analyze the corner points we have so far. The following table shows the list of points as it was constructed during the raster-scanning procedure to convert corner pixels to points:

The four corners that make up the first rectangle are the blue rows in the points column – those are points {1,2,5,6}. Let's assume some things:

  1. As we scanned the image from right to left, top to bottom, these points are ordered from smallest x, smallest y, to largest x and largest y
  2. The reconstruction seems to follow a known order: from top to bottom. That is, in the list, we first encounter the top leftmost point of the rectangle, followed by the top rightmost point. Then the bottom leftmost point and, finally, the bottom rightmost point. We can exploit this to reconstruct the rectangle
  3. We assume we are processing a non-rotated, right rectangle. Meaning their inner angles are always of 90 degrees.

With that in mind and having a look at the corner table, we note that each point shares at least one coordinate with another point. For example, P1 shares its y coordinate with P2. P5 shares its x coordinate with P1. P5 also shares its y coordinate with P6. P6 is made from the x coordinate of P2 and the y coordinate of P5. This gives us almost a recipe to reconstruct the first rectangle from the table.

Suppose we traverse the list. We fetch the first point, P1. Cool, this gives us the first point, made from an x, y pair. Let’s look for P2. We know it must be laying on the same vertical line. From the table, it was clear that P2 shares its y coordinate with P1. Let’s look for the point that has the same y coordinate then. If we are traversing the table from top to boom, we would be fetching P2 in the next algorithm iteration. So far, we would have P1 and P2.

Let's look for the next point. This would be laying down directly below P1, so it must share its vertical coordinate. Let's look for the next point (remember - top to bottom) that has exactly the same x as P1. We would then find P5 as the next point. Nice, only one remaining. The next point would be sharing its horizontal coordinate with P5 – that would lead us to P6. 4 corners, a new rectangle!

Check out the updated points table:

The common coordinates in the table are joined by colors: The yellow and gray cells are the shared x coordinates among different points. The green and orang-ish cells share y coordinates.

Ok, so, in one iteration we traversed the list for times – each time a corner point was extracted to reconstruct a rectangle. That means that we will be traversing this list 4 x number of rectangles on the image. That's a lot of times. However, once a point has been processed, there's no need to re-process it again. Furthermore, we can avoid duplicates by marking those points we know are part of a rectangle. For this, I use an extra flag to keep track of those corners that have been processed. That processed column marks the points and its filled with a 1 if the corner was assigned to a rectangle. 0 if the point is currently unprocessed.

With all of this in mind, let's see the UNOPTIMIZED code (Please mind that the variable names are not related to the table I posted above):

# Store the final rectangles here:
rectangleList = []

# Get the total points in the points list:
totalPoints = len(pointsList)

# Traverse the points list:
for a in range(totalPoints):

    # tempRect to store all four corners
    # # of a rectangle:
    tempRect = [None]*4

    # Get the current first point - P1:
    p1 = pointsList[a]

    # Get x, y and isProcessed flag from P1:
    p1x = p1[0]
    p1y = p1[1]
    isProcessed = p1[2]

    # Just process it if it hasn't been processed before:
    if isProcessed == 0:

        # Mark processed in list:
        pointsList[a][2] = 1
        # First point goes into temporal struct:
        tempRect[0] = (p1x, p1y)

        # Let's look for the following point:
        for b in range(totalPoints):

            # Get x, y and isProcessed flag from P2:
            p2 = pointsList[b]
            p2y = p2[1]
            isProcessed = p2[2]

            # Just process it if it hasn't been processed before,
            # We are looking for the point that shares its y:
            if isProcessed == 0 and p2y == p1y:

                # Get P2x:
                p2x = p2[0]

                # Mark processed:
                pointsList[b][2] = 1
                # Second point goes into temporal struct:
                tempRect[1] = (p2x, p2y)

                # Let's look for the following point:
                for c in range(totalPoints):

                    # Get x, y and isProcessed flag from P3:
                    p3 = pointsList[c]
                    p3x = p3[0]
                    isProcessed = p3[2]

                    # Just process it if it hasn't been processed before,
                    # We are looking for the point that shares its x with P1:
                    if isProcessed == 0 and p1x == p3x:

                        # Get P2y:
                        p3y = p3[1]

                        # Mark processed:
                        pointsList[c][2] = 1
                        # Third point goes into temporal struct:
                        tempRect[2] = (p3x, p3y)

                        # Let's look for the following point:
                        for d in range(totalPoints):

                            # Get x, y and isProcessed flag from P4:
                            p4 = pointsList[d]
                            p4y = p4[1]
                            isProcessed = p4[2]

                            # Just process it if it hasn't been processed before,
                            # We are looking for the point that shares its y with P3:
                            if isProcessed == 0 and p3y == p4y:

                                # Get P4y:
                                p4x = p4[0]

                                # Mark processed:
                                pointsList[d][2] = 1
                                # Fourth point goes into temporal struct:
                                tempRect[3] = (p4x, p4y)

                                # We now have a full rectangle, store it in the
                                # rectangle list:
                                rectangleList.append(tempRect)

Its very straightforward. Note that we are looping through the points list at a minimum of four times. We have a loop every time we look for a corner point. Each time a point satisfies our search condition we store it in a temporal array. At the end, after 4 points have been processed, we can finally store the temporal array (the full rectangle) in a separated list. Again, there's a lot here that can be optimized.

Let's draw our rectangles. The following bit just loops through the rectangle list, fetches the rectangle and gets the diagonal for drawing. Note that I'm just using two corners of the four found. Shall you require all corners, you have that info available. Another point for further optimization. Let's use a random color to draw the rectangles:

# Finally, draw the rectangles:
for r in range(len(rectangleList)):
    # Get current rectangle:
    currentRect = rectangleList[r]
    # Set rectangle:
    x1 = currentRect[0][0]
    y1 = currentRect[0][1]
    x2 = currentRect[3][0]
    y2 = currentRect[3][1]

    # Set a random BGR color for the rectangle:
    color = (np.random.randint(low=0, high=256), np.random.randint(low=0, high=256), np.random.randint(low=0, high=256))
    # Draw the rectangle:
    cv2.rectangle(inputImageCopy, (int(x1), int(y1)), (int(x2), int(y2)), color, 2)

    cv2.imshow("Rects", inputImageCopy)
    cv2.waitKey(0)

This is an image of two rectangles:

And three rectangles:

Upvotes: 3

Related Questions