GalacticSquirrel

Reputation: 39

How to detect if an image is in another image?

I am trying to detect whether one image appears inside another image as a 100% match, and then set a variable to True if it does. Everything I have read has turned up little to nothing, apart from one thread that gives this code:

import cv2

method = cv2.TM_SQDIFF_NORMED

# Read the images from the file
small_image = cv2.imread('ran_away.png')
large_image = cv2.imread('pokemon_card.png')

# matchTemplate is documented as taking the searched image first, then the template
result = cv2.matchTemplate(large_image, small_image, method)

# We want the minimum squared difference
mn,_,mnLoc,_ = cv2.minMaxLoc(result)

# Draw the rectangle:
# Extract the coordinates of our best match
MPx,MPy = mnLoc

# Step 2: Get the size of the template. This is the same size as the match.
trows,tcols = small_image.shape[:2]

# Step 3: Draw the rectangle on large_image
cv2.rectangle(large_image, (MPx,MPy),(MPx+tcols,MPy+trows),(0,0,255),2)

# Display the original image with the rectangle around the match.
cv2.imshow('output',large_image)

# The image is only displayed if we call this
cv2.waitKey(0)

However, this opens an output window and does things I do not want. All I want is to detect whether one image is inside another and, if it is, print that to the console. In my case, I am trying to detect whether this image

Ran_away.png

is in this image

Pokemon_card.png

and, if it is, print to the console that the Pokemon has run away.

Upvotes: 3

Views: 7439

Answers (2)

Rotem

Reputation: 32084

I found a solution using the relatively new NumPy function sliding_window_view. The NumPy documentation describes it as follows:

Create a sliding window view into the array with the given window shape.

Also known as rolling or moving window, the window slides across all dimensions of the array and extracts subsets of the array at all window positions.

New in version 1.20.0.

Note: I installed the latest NumPy version in a new virtual environment because of compatibility concerns.

Simple test for checking how sliding_window_view works:

import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

t = np.array([[ [0,0,0], [1,1,1]],
              [ [2,2,2], [3,3,3]]])

x = np.array([[ [0,0,0],  [1,1,1],  [2,2,2],  [3,3,3]],
              [[10,10,10], [11,11,11], [12,12,12], [13,13,13]],
              [[20,20,20], [21,21,21], [22,22,22], [23,23,23]]])

x[1:3, 1:3, :] = t  # Copy t to x - it looks like there is a problem along edges

v = sliding_window_view(x, (2,2,3))

print(v-t)

Result starts with:

[[[[[[ 0  0  0]
     [ 0  0  0]]

That means that t is subtracted from all "windows" of v as expected.


Add the following command for testing np.all:

print(np.where((v == t).all(axis=(3, 4, 5))))

The output is:

(array([1], dtype=int64), array([1], dtype=int64), array([0], dtype=int64))

all(axis=(3, 4, 5)) is True if all elements along axes 3, 4 and 5 are True.
In the above example we found a match at index [1, 1].
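
To see why the axes are 3, 4 and 5, it can help to look at the shape of the view. A minimal sketch, reusing only the shapes of x and t from the test above (the array contents here are placeholders, not the values used there):

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

x = np.zeros((3, 4, 3), dtype=np.uint8)   # same shape as x above
t = np.ones((2, 2, 3), dtype=np.uint8)    # same shape as t above

v = sliding_window_view(x, t.shape)

# Axes 0, 1, 2 index the window position; axes 3, 4, 5 index the
# rows, columns and channels inside each window.
print(v.shape)      # (2, 3, 1, 2, 2, 3)

# Reducing over the window axes leaves one boolean per window position.
hits = (v == t).all(axis=(3, 4, 5))
print(hits.shape)   # (2, 3, 1)
```

The third position axis has length 1 because the window spans all three channels, which is why the third array in the np.where output is always 0.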


Here is a solution for detecting a perfect match (using NumPy):

import cv2
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

# Read the images from the file
small_image = cv2.imread('ran_away.png')
#small_image = cv2.imread('icon.png');
large_image = cv2.imread('pokemon_card.png')

v = sliding_window_view(large_image, small_image.shape)

match_idx = np.where((v == small_image).all(axis=(3, 4, 5)))

if len(match_idx[0]) > 0:
    row = match_idx[0][0]
    col = match_idx[1][0]

    # Width comes from shape[1], height from shape[0]
    cv2.rectangle(large_image, (col, row), (col+small_image.shape[1], row+small_image.shape[0]), (0, 255, 0), 2)

    cv2.imshow('large_image', large_image)
    cv2.waitKey()
    cv2.destroyAllWindows()
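
Since the question only asks for a console message, the drawing and display calls can be dropped. Here is a minimal sketch under that assumption. find_exact_match is a hypothetical helper name, not part of NumPy or OpenCV, and the synthetic arrays stand in for the two image files so that the sketch runs without them:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def find_exact_match(large, small):
    """Return (row, col) of the first exact occurrence of small in large, or None.

    Hypothetical helper; both arguments are image arrays with the same
    number of dimensions (for example two H x W x 3 BGR images).
    """
    # The template cannot fit if it is larger than the image along any axis.
    if large.ndim != small.ndim or any(s > l for s, l in zip(small.shape, large.shape)):
        return None

    windows = sliding_window_view(large, small.shape)
    # Compare every window with the template and reduce over the window axes.
    window_axes = tuple(range(large.ndim, 2 * large.ndim))
    hits = np.argwhere((windows == small).all(axis=window_axes))
    if len(hits) == 0:
        return None
    return int(hits[0][0]), int(hits[0][1])


# Synthetic stand-ins for the two images (assumption: random BGR data).
rng = np.random.default_rng(0)
large_image = rng.integers(0, 256, size=(40, 60, 3), dtype=np.uint8)
small_image = large_image[10:18, 25:37].copy()

if find_exact_match(large_image, small_image) is not None:
    print('The Pokemon has run away.')
```

With real files, the two arrays would come from cv2.imread as in the code above. Note that the comparison builds one boolean per pixel of every window, so this approach may need a lot of memory for large images.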


Upvotes: 3

HansHirse

Reputation: 18895

Your code shows basic template matching. Please work through some tutorial on that topic, and the documentation on cv2.matchTemplate, especially to understand the different template match modes.

I can only think of the following solution to approach your task: Instead of using TM_SQDIFF_NORMED, use TM_SQDIFF, such that you get absolute values in the result instead of relative values:

  • For TM_SQDIFF_NORMED, the best match will always be some value near 0.0, even if the match isn't correct.
  • For TM_SQDIFF, some value near 0.0 indicates an actual correct match.
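
To illustrate the two modes, here is a rough NumPy sketch of what they compute for single-channel images, following the formulas in the OpenCV documentation: TM_SQDIFF sums the squared differences between the template and each window, and TM_SQDIFF_NORMED divides that sum by the square root of the product of the template's and the window's sums of squares. The function names sqdiff and sqdiff_normed and the small test arrays are illustrative only, and this is not how OpenCV implements the computation internally:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def sqdiff(image, templ):
    """Sum of squared differences for every template position (like TM_SQDIFF)."""
    windows = sliding_window_view(image.astype(np.float64), templ.shape)
    return ((windows - templ.astype(np.float64)) ** 2).sum(axis=(2, 3))


def sqdiff_normed(image, templ):
    """Squared differences scaled by template and window energy (like TM_SQDIFF_NORMED)."""
    windows = sliding_window_view(image.astype(np.float64), templ.shape)
    templ_energy = (templ.astype(np.float64) ** 2).sum()
    window_energy = (windows ** 2).sum(axis=(2, 3))
    return sqdiff(image, templ) / np.sqrt(templ_energy * window_energy)


image = np.array([[10, 20, 30, 40],
                  [50, 60, 70, 80],
                  [90, 15, 25, 35]], dtype=np.uint8)
present = image[1:3, 1:3].copy()                          # cut out of the image
absent = np.array([[200, 1], [1, 200]], dtype=np.uint8)   # not in the image

print(sqdiff(image, present).min())       # 0.0: an exact copy gives zero difference
print(sqdiff(image, absent).min() > 0)    # True: no window matches exactly
print(sqdiff_normed(image, absent).min()) # relative value, scaled by the pixel energies
```

The unnormalized value is an absolute measure of pixel differences, so it can be compared against a fixed threshold near 0.0.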

So, write a function that does the template matching and checks whether the minimum value of result is below some threshold near 0.0, say 10e-6 (that is, 1e-5). If so, print whatever you want; if not, do something else:

import cv2


def is_template_in_image(img, templ):

    # Template matching using TM_SQDIFF: Perfect match => minimum value around 0.0
    result = cv2.matchTemplate(img, templ, cv2.TM_SQDIFF)

    # Get value of best match, i.e. the minimum value
    min_val = cv2.minMaxLoc(result)[0]

    # Set up threshold for a "sufficient" match
    thr = 10e-6

    return min_val <= thr


# Read template
template = cv2.imread('ran_away.png')

# Collect image file names
images = ['pokemon_card.png', 'some_other_image.png']

for image in images:
    if is_template_in_image(cv2.imread(image), template):
        print('{}: {}'.format(image, 'Pokemon has ran away.'))
    else:
        print('{}: {}'.format(image, 'Nothing to see here.'))

The output:

pokemon_card.png: Pokemon has ran away.
some_other_image.png: Nothing to see here.
----------------------------------------
System information
----------------------------------------
Platform:      Windows-10-10.0.19041-SP0
Python:        3.9.1
PyCharm:       2021.1.1
OpenCV:        4.5.2
----------------------------------------

Upvotes: 5
