Reputation: 165
I'm currently building a recognition program using various feature extractors and matchers. Using the score from the matcher, I want to derive a score threshold that can determine whether a pair is a correct match or not.
I am trying to understand the meaning of the DMatch distance returned by the various matchers: does a smaller distance value mean a better match? If so, I am confused, because the same image in a different position returns a larger value than two completely different images.
I've run two test cases:
-----------------------------------------------
Positive image average distance
Total test number: 70
Comparing with SIFT
Use BF with Ratio Test: 874.071456255
Use FLANN : 516.737270464
Comparing with SURF
Use BF with Ratio Test: 2.92960552163
Use FLANN : 1.47225751158
Comparing with ORB
Use BF : 12222.1428571
Use BF with Ratio Test: 271.638643755
Comparing with BRISK
Use BF : 31928.4285714
Use BF with Ratio Test: 1537.63658578
Maximum positive image distance
Comparing with SIFT
Use BF with Ratio Test: 2717.88008881
Use FLANN : 1775.63563538
Comparing with SURF
Use BF with Ratio Test: 4.88817568123
Use FLANN : 2.81848525628
Comparing with ORB
Use BF : 14451.0
Use BF with Ratio Test: 1174.47851562
Comparing with BRISK
Use BF : 41839.0
Use BF with Ratio Test: 3846.39746094
-----------------------------------------
Negative image average distance
Total test number: 72
Comparing with SIFT
Use BF with Ratio Test: 750.028228866
Use FLANN : 394.982576052
Comparing with SURF
Use BF with Ratio Test: 2.89866939275
Use FLANN : 1.59815886725
Comparing with ORB
Use BF : 12098.9444444
Use BF with Ratio Test: 261.874231339
Comparing with BRISK
Use BF : 31165.8472222
Use BF with Ratio Test: 1140.46670034
Minimum negative image distance
Comparing with SIFT
Use BF with Ratio Test: 0
Use FLANN : 0
Comparing with SURF
Use BF with Ratio Test: 1.25826786458
Use FLANN : 0.316588282585
Comparing with ORB
Use BF : 10170.0
Use BF with Ratio Test: 0
Comparing with BRISK
Use BF : 24774.0
Use BF with Ratio Test: 0
Also, in some cases when two different images are compared and there is no match at all, the matcher returns a total score of 0, which is exactly the same score as when two identical images are compared.
After further inspection, there are four major cases.
Finding the correct threshold value based on these cases seems to be the problem, since some of them contradict each other. Usually, the more similar the images, the lower the distance value.
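For reference, the totals above and in the code below are simply the sum of DMatch.distance over the kept matches, so an empty match list and a perfect match both come out as 0. A minimal sketch of one way I could disambiguate the two, normalizing by the number of surviving matches and using an arbitrary, untuned minimum-match guard:
def match_score(good_matches):
    MIN_GOOD_MATCHES = 10  # arbitrary placeholder, not tuned
    # Too few surviving matches: report "no match" instead of a misleading 0
    if len(good_matches) < MIN_GOOD_MATCHES:
        return None
    # The average distance stays comparable across images with different match counts
    return sum(m.distance for m in good_matches) / len(good_matches)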
# matcher.py
import cv2
import numpy as np
from matplotlib import pyplot as plt

def useBruteForce(img1, img2, kp1, kp2, des1, des2, setDraw):
    # create BFMatcher object
    bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    # Match descriptors.
    matches = bf.match(des1, des2)
    # Sort them in the order of their distance.
    matches = sorted(matches, key = lambda x: x.distance)
    totalDistance = 0
    for g in matches:
        totalDistance += g.distance
    if setDraw == True:
        # Draw matches.
        img3 = cv2.drawMatches(img1, kp1, img2, kp2, matches, None, flags=2)
        plt.imshow(img3), plt.show()
    return totalDistance
def useBruteForceWithRatioTest(img1, img2, kp1, kp2, des1, des2, setDraw):
    # BFMatcher with default params
    bf = cv2.BFMatcher()
    matches = bf.knnMatch(des1, des2, k=2)
    # Apply ratio test
    good = []
    for m, n in matches:
        if m.distance < 0.75*n.distance:
            good.append(m)
    totalDistance = 0
    for g in good:
        totalDistance += g.distance
    if setDraw == True:
        # cv2.drawMatchesKnn expects list of lists as matches.
        img3 = cv2.drawMatchesKnn(img1, kp1, img2, kp2, [good], None, flags=2)
        plt.imshow(img3), plt.show()
    return totalDistance
def useFLANN(img1, img2, kp1, kp2, des1, des2, setDraw, type):
    # Fast Library for Approximate Nearest Neighbors
    MIN_MATCH_COUNT = 1
    FLANN_INDEX_KDTREE = 0
    FLANN_INDEX_LSH = 6
    if type == True:
        # Detect with ORB
        index_params = dict(algorithm = FLANN_INDEX_LSH,
                            table_number = 6,      # 12
                            key_size = 12,         # 20
                            multi_probe_level = 1) # 2
    else:
        # Detect with others such as SURF, SIFT
        index_params = dict(algorithm = FLANN_INDEX_KDTREE, trees = 5)
    # Number of times the trees in the index should be recursively traversed.
    # Higher values give better precision but also take more time.
    search_params = dict(checks = 90)
    flann = cv2.FlannBasedMatcher(index_params, search_params)
    matches = flann.knnMatch(des1, des2, k=2)
    # Store all the good matches as per Lowe's ratio test.
    good = []
    for m, n in matches:
        if m.distance < 0.7*n.distance:
            good.append(m)
    totalDistance = 0
    for g in good:
        totalDistance += g.distance
    if setDraw == True:
        if len(good) > MIN_MATCH_COUNT:
            src_pts = np.float32([ kp1[m.queryIdx].pt for m in good ]).reshape(-1, 1, 2)
            dst_pts = np.float32([ kp2[m.trainIdx].pt for m in good ]).reshape(-1, 1, 2)
            M, mask = cv2.findHomography(src_pts, dst_pts, cv2.RANSAC, 5.0)
            matchesMask = mask.ravel().tolist()
            h, w = img1.shape
            pts = np.float32([ [0,0], [0,h-1], [w-1,h-1], [w-1,0] ]).reshape(-1, 1, 2)
            dst = cv2.perspectiveTransform(pts, M)
            img2 = cv2.polylines(img2, [np.int32(dst)], True, 255, 3, cv2.LINE_AA)
        else:
            print "Not enough matches are found - %d/%d" % (len(good), MIN_MATCH_COUNT)
            matchesMask = None
        draw_params = dict(matchColor = (0, 255, 0), # draw matches in green color
                           singlePointColor = None,
                           matchesMask = matchesMask, # draw only inliers
                           flags = 2)
        img3 = cv2.drawMatches(img1, kp1, img2, kp2, good, None, **draw_params)
        plt.imshow(img3, 'gray'), plt.show()
    return totalDistance
import cv2
import matcher

def check(img1, img2, kp1, kp2, des1, des2, matcherType, setDraw, ORB):
    if matcherType == 1:
        return matcher.useBruteForce(img1, img2, kp1, kp2, des1, des2, setDraw)
    elif matcherType == 2:
        return matcher.useBruteForceWithRatioTest(img1, img2, kp1, kp2, des1, des2, setDraw)
    elif matcherType == 3:
        return matcher.useFLANN(img1, img2, kp1, kp2, des1, des2, setDraw, ORB)
    else:
        print "Matcher not chosen correctly, use Brute Force matcher as default"
        return matcher.useBruteForce(img1, img2, kp1, kp2, des1, des2, setDraw)
def useORB(filename1, filename2, matcherType, setDraw):
    img1 = cv2.imread(filename1, 0) # queryImage
    img2 = cv2.imread(filename2, 0) # trainImage
    # Initiate ORB detector
    orb = cv2.ORB_create()
    # find the keypoints and descriptors with ORB
    kp1, des1 = orb.detectAndCompute(img1, None)
    kp2, des2 = orb.detectAndCompute(img2, None)
    ORB = True
    return check(img1, img2, kp1, kp2, des1, des2, matcherType, setDraw, ORB)

def useSIFT(filename1, filename2, matcherType, setDraw):
    img1 = cv2.imread(filename1, 0) # queryImage
    img2 = cv2.imread(filename2, 0) # trainImage
    # Initiate SIFT detector
    sift = cv2.xfeatures2d.SIFT_create()
    # find the keypoints and descriptors with SIFT
    kp1, des1 = sift.detectAndCompute(img1, None)
    kp2, des2 = sift.detectAndCompute(img2, None)
    ORB = False
    return check(img1, img2, kp1, kp2, des1, des2, matcherType, setDraw, ORB)

def useSURF(filename1, filename2, matcherType, setDraw):
    img1 = cv2.imread(filename1, 0)
    img2 = cv2.imread(filename2, 0)
    # Here I set Hessian Threshold to 400
    surf = cv2.xfeatures2d.SURF_create(400)
    # Find keypoints and descriptors directly
    kp1, des1 = surf.detectAndCompute(img1, None)
    kp2, des2 = surf.detectAndCompute(img2, None)
    ORB = False
    return check(img1, img2, kp1, kp2, des1, des2, matcherType, setDraw, ORB)

def useBRISK(filename1, filename2, matcherType, setDraw):
    img1 = cv2.imread(filename1, 0) # queryImage
    img2 = cv2.imread(filename2, 0) # trainImage
    # Initiate BRISK detector
    brisk = cv2.BRISK_create()
    # find the keypoints and descriptors with BRISK
    kp1, des1 = brisk.detectAndCompute(img1, None)
    kp2, des2 = brisk.detectAndCompute(img2, None)
    ORB = True
    return check(img1, img2, kp1, kp2, des1, des2, matcherType, setDraw, ORB)
Upvotes: 8
Views: 5487
Reputation: 1
According to the OpenCV docs, better matches should give lower distances:
DMatch.distance - Distance between descriptors. The lower, the better it is.
Instead of using a distance threshold to determine if two images are a true match, I just checked that the top matches gave consistent transformations.
In my case I was working with microscope images, so I was only interested in the x and y translation between each pair of matched images. Something like this worked quite well for me.
import numpy as np

best_matches = sorted(matches, key=lambda x: x.distance)[:5]
x_offsets, y_offsets = [], []
for match in best_matches:
    key_point_1 = image_1.key_points[match.queryIdx]
    key_point_2 = image_2.key_points[match.trainIdx]
    x_offsets.append(key_point_1.pt[0] - key_point_2.pt[0])
    y_offsets.append(key_point_1.pt[1] - key_point_2.pt[1])
tolerance = 1  # a 1 pixel shift is acceptable
is_true_match = np.std(x_offsets) < tolerance and np.std(y_offsets) < tolerance
This has the benefit of not needing to qualify what match distance is good enough.
However, this becomes a bit more tricky when the transformation is more complicated, i.e. when scaling, rotation, and perspective changes are expected. You would need multiple groups of matches to find multiple transformation matrices and come up with a reasonable way to measure how similar the matrices are.
You also risk not having enough image overlap to give you the number of true matches you need. This can be mitigated by using a detector like SIFT, which finds more keypoints at the expense of being slightly slower.
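For the more complicated transformations mentioned above, a rough sketch of one common alternative (not the approach I used for the microscope images) is to fit a single homography with RANSAC and treat the inlier count and inlier ratio as the consistency measure; the min_inliers and min_inlier_ratio values here are arbitrary placeholders:
import numpy as np
import cv2

def is_consistent_match(kp1, kp2, good_matches, min_inliers=10, min_inlier_ratio=0.5):
    # At least 4 correspondences are needed to estimate a homography
    if len(good_matches) < 4:
        return False
    src_pts = np.float32([kp1[m.queryIdx].pt for m in good_matches]).reshape(-1, 1, 2)
    dst_pts = np.float32([kp2[m.trainIdx].pt for m in good_matches]).reshape(-1, 1, 2)
    # RANSAC keeps only the matches that agree with a single perspective transform
    H, mask = cv2.findHomography(src_pts, dst_pts, cv2.RANSAC, 5.0)
    if H is None:
        return False
    inliers = int(mask.sum())
    # Require both an absolute and a relative number of geometrically consistent matches
    return inliers >= min_inliers and float(inliers) / len(good_matches) >= min_inlier_ratio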
Upvotes: 0
Reputation: 269
In OpenCV's tutorial, it is said that
For BF matcher, first we have to create the BFMatcher object using cv.BFMatcher(). It takes two optional params. First one is normType. It specifies the distance measurement to be used. By default, it is cv.NORM_L2. It is good for SIFT, SURF etc (cv.NORM_L1 is also there). For binary string based descriptors like ORB, BRIEF, BRISK etc, cv.NORM_HAMMING should be used, which used Hamming distance as measurement. If ORB is using WTA_K == 3 or 4, cv.NORM_HAMMING2 should be used.
https://docs.opencv.org/3.4/dc/dc3/tutorial_py_matcher.html
So you should create different matcher objects, with the appropriate norm, for SIFT and ORB (you get the idea). That might be the reason why the distances you calculated differ so much.
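A minimal sketch of what that looks like; des1_sift / des2_sift and des1_orb / des2_orb stand in for descriptors you have already computed with the respective detector:
import cv2

# Float descriptors (SIFT, SURF): use Euclidean distance
bf_float = cv2.BFMatcher(cv2.NORM_L2, crossCheck=True)
matches_sift = bf_float.match(des1_sift, des2_sift)

# Binary descriptors (ORB, BRIEF, BRISK): use Hamming distance
bf_binary = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches_orb = bf_binary.match(des1_orb, des2_orb)

# The two distance scales are not directly comparable with each other
print(sorted(m.distance for m in matches_sift)[:5])
print(sorted(m.distance for m in matches_orb)[:5])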
Upvotes: 0