Jonathan

Reputation: 21

python - OpenCV false positives

I'm working with OpenCV in Python and get an absurd number of false positives when I turn my threshold down, but when I turn it up I no longer get the image I'm looking for, or any match at all. I have to lower it to 0.4 to get anything. Does anyone have any ideas? Below are the screenshot I took, the template image I'm looking for in the screenshot, and the result.

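import time

import cv2
import numpy as np
from PIL import ImageGrab  # assuming ImageGrab comes from Pillow

# Bounding box of the full 1920x1080 screen
screen = (0, 0, 1920, 1080)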
ImageGrab.grab(screen).save("screenshots/screenshot.jpg")
time.sleep(2)

# Read the main image
img_rgb = cv2.imread('screenshots/screenshot.jpg')

# Convert to grayscale
img_gray = cv2.cvtColor(img_rgb, cv2.COLOR_BGR2GRAY)

# Read the template
template = cv2.imread('monsters/knight.jpg', 0)

# Store width and height of template in w and h
w, h = template.shape[::-1]

res = cv2.matchTemplate(img_gray, template, cv2.TM_CCOEFF_NORMED)
threshold = 0.4
loc = np.where(res >= threshold)

for pt in zip(*loc[::-1]):
    cv2.rectangle(img_rgb, pt, (pt[0] + w, pt[1] + h), (0, 255, 255), 2)

cv2.imshow('Detected', img_rgb)
cv2.waitKey(0)

False positives

'knight.jpg'

'screenshots/screenshot.jpg'

Upvotes: 0

Views: 679

Answers (1)

Ollin Boer Bohan

Reputation: 2401

Your template is at a different scale from your search space.

Comparison of template and search image

Since matchTemplate is only checking that single scale, you won't get good detections. You need to either correct the scale or search at a variety of scales.

Here's some (quick) code that will search at a variety of scales:

# float32 to match the dtype of the matchTemplate result
overall_score = np.zeros(img_gray.shape, dtype=np.float32)
# search scales 1.0, 1.1, 1.2...
scales = np.arange(1.0, 2.0, 0.1)
for scale in scales:
    # resize the template to that scale
    t_w = int(w * scale)
    t_h = int(h * scale)
    scaled_template = cv2.resize(template, (t_w, t_h))
    res = cv2.matchTemplate(img_gray, scaled_template, cv2.TM_CCOEFF_NORMED)
    # pad the results so that we can combine them across each scale
    res = cv2.copyMakeBorder(
        res, t_h // 2, (t_h - 1) // 2, t_w // 2, (t_w - 1) // 2, cv2.BORDER_CONSTANT
    )
    # combine the results
    overall_score = np.maximum(res, overall_score)
# we can use a much higher threshold
threshold = 0.9
loc = np.where(overall_score >= threshold)

# since we padded the images, coordinates are centers rather than top-left
for pt in zip(*loc[::-1]):
    cv2.rectangle(
        img_rgb,
        (pt[0] - w // 2, pt[1] - h // 2),
        (pt[0] + w // 2, pt[1] + h // 2),
        (0, 255, 255),
        2,
    )
cv2.imwrite("detections.png", img_rgb)

Using this code gives the expected result:

[Image: detection result]

Upvotes: 2
