Piyush Shandilya

Reputation: 75

Using Keypoint feature matching + Homography to straighten document (Aadhaar)

Hi, I'm trying to create an OCR pipeline where the model should be able to read an uploaded document. However, a lot of the time the uploaded documents are skewed or tilted. I plan to straighten and/or resize each document based on a template.

To achieve this, I intend to use feature matching and homography. However, whenever I compute keypoints and descriptors (using ORB) and try to match them with brute-force matching, none of the features seem to match. Here's the code I've used so far and the results with it. Can someone point me in the right direction if I'm missing something or doing something in an incorrect way?

def straighten_image(ORIG_IMG, IMG2):
    # read both the images:
    orig_image = cv2.imread(ORIG_IMG)
    img_input = cv2.imread(IMG2)
    
    orig_gray_scale = cv2.cvtColor(orig_image, cv2.COLOR_BGR2GRAY)
    gray_scale_img = cv2.cvtColor(img_input, cv2.COLOR_BGR2GRAY)
    
    #Detect ORB features and compute descriptors
    MAX_NUM_FEATURES = 100
    orb = cv2.ORB_create(MAX_NUM_FEATURES)
    keypoints1, descriptors1 = orb.detectAndCompute(orig_gray_scale, None)
    keypoints2, descriptors2 = orb.detectAndCompute(gray_scale_img, None)
    
    # Draw keypoints for visual inspection
    orig_with_descriptors = cv2.drawKeypoints(orig_gray_scale, keypoints1, outImage=np.array([]), color=(255, 0, 0), flags=cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)
    inp_with_descriptors = cv2.drawKeypoints(img_input, keypoints2, outImage=np.array([]), color=(255, 0, 0), flags=cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)

    #Match features
    
    matcher = cv2.DescriptorMatcher_create(cv2.DESCRIPTOR_MATCHER_BRUTEFORCE_HAMMING)
    matches = matcher.match(descriptors1, descriptors2)

    # Sort matches by descriptor distance (best first). Newer OpenCV
    # versions return a tuple here, so use sorted() instead of list.sort();
    # without sorting, the slice below keeps arbitrary matches.
    matches = sorted(matches, key=lambda m: m.distance)

    # Keep only the best 10% of matches
    numGoodMatches = int(len(matches) * 0.1)
    matches = matches[:numGoodMatches]
    
    #Draw Top matches
    im_matches = cv2.drawMatches(orig_gray_scale, keypoints1, gray_scale_img, keypoints2, matches, None)
    
    cv2.imshow("", im_matches)
    cv2.waitKey(0)
    
    #Homography
    points1 = np.zeros((len(matches), 2), dtype = np.float32)
    points2 = np.zeros((len(matches), 2), dtype = np.float32)
    
    for i, match in enumerate(matches):
        points1[i, :] = keypoints1[match.queryIdx].pt
        points2[i, :] = keypoints2[match.trainIdx].pt
        
    #Find homography:
    h, mask = cv2.findHomography(points2, points1, cv2.RANSAC)
    
    # Use the homography to warp the input image onto the template
    height, width = orig_gray_scale.shape
    inp_reg = cv2.warpPerspective(gray_scale_img, h, (width, height), borderValue = 255)
    
    return inp_reg


import cv2
import matplotlib.pyplot as plt
import numpy as np
template = "template_aadhaar.jpg"
test = "test.jpeg"

str_img = straighten_image(template, test)

cv2.imshow("", str_img)
cv2.waitKey(0)

This is the template image

and the test image that needs to be straightened

Matched features

EDIT: If I use my own ID-card (perfectly straight) as the template and try to align the same ID-card that is tilted, it matches the features and re-aligns the tilted image perfectly. However, I need the model to be able to re-align any other ID-card based on the template. By any ID, I mean the details could be different but the location and font would be exactly the same.

EDIT#2: As suggested by @Olli, I tried using a template with only those features that are the same for all Aadhaar cards. Image attached. But the feature matching is still somewhat arbitrary.

Template with changing values removed

Upvotes: 0

Views: 846

Answers (1)

Olli

Reputation: 303

Feature matching detects the most significant features in an image and tries to match them against another image. This only works if the features really are the same. If the features are similar but different, it will fail.

If you have some features that are always the same (e.g. the logo on the top left), you could try to create a template containing only these features, with all other areas blanked out, i.e. remove the person and the name and the QR code and...

But because there are more differences ("Government of India" inside the green area in one image and above it in the other, ...) than similarities, I would try to find the rotation based on the corners and/or the edges of the card shape. For example:

  • convert to grayscale
  • perform canny edge detection
  • detect corners, e.g. using cv2.goodFeaturesToTrack. If some corners are hidden, try finding the sides using Hough lines instead.
  • undistort (warp the detected corners onto a rectangle with a perspective transform)

If some images are rotated 90, 180 or 270 degrees after undistortion, you could use a filter to find the orange and green areas and rotate so that this area is at the top again.

Upvotes: 1
