Best way to detect GUI buttons using an image file as reference in Python/OpenCV

Question

I'm challenging myself to automate some things while playing a game called Pokemon TCG Online.

Since I don't know anything about reverse engineering, I'm trying to use computer vision to identify objects and perform tasks.

The GUI of the game is always the same, so I don't have to deal with color variance and other things. My first thought was to use template matching, but I'm having a problem with false positives.

The other two alternatives I found were using a Haar cascade (I found a "bot" for another game that uses one) or training a neural network to recognize every element.

Before I go deep into one approach, I would like to find the best one, to avoid wasting time on something that doesn't work. Also, I don't want to "use a sledgehammer to crack a nut", so I'm looking for a simple and elegant way to do it.

My first approach was to use Python and OpenCV, since both are simple to use, but I'm open to other tools. I know how to use YOLO in Python, but I only succeeded in installing it on Linux, and the game runs on Windows.

Thank you very much

The code I'm using:

import cv2
import numpy as np
import pyautogui
from PIL import ImageGrab

# Screen region to capture: (left, top, right, bottom)
BBOX = (0, 0, 1000, 1000)

fourcc = cv2.VideoWriter_fourcc('X', 'V', 'I', 'D')  # you can use other codecs as well
# Frame size (width, height) must match the size of the captured region
vid = cv2.VideoWriter('record.avi', fourcc, 8, (1000, 1000))
jogar = cv2.imread("jogar.png", 0)  # template, loaded as grayscale

while True:
    img = ImageGrab.grab(bbox=BBOX)
    img_np = np.array(img)  # PIL uses RGB channel order, not OpenCV's BGR
    img_npGray = cv2.cvtColor(img_np, cv2.COLOR_RGB2GRAY)
    vid.write(cv2.cvtColor(img_np, cv2.COLOR_RGB2BGR))  # VideoWriter expects BGR
    cv2.imshow("frame", img_npGray)
    res = cv2.matchTemplate(img_npGray, jogar, cv2.TM_SQDIFF)
    threshold = 0.9
    loc = np.where(res >= threshold)
    # pyautogui.moveTo(loc)
    print(loc)

    key = cv2.waitKey(1)
    if key == 27:  # Esc
        break

vid.release()
cv2.destroyAllWindows()

bfris · Accepted Answer

In my comment I said the tutorials in the official docs were great, and they are. But you have to do some searching for the sample images. Many of them are here, including the Messi picture used for the template matching tutorial.

This code works. If you are using TM_SQDIFF, then the best match will be found as a minimum. Also, you probably want the best match using cv2.minMaxLoc, rather than using a threshold.

import cv2
import numpy as np

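screenshot = cv2.imread("screenshot.png", 0)  # 0 = load as grayscale
template = cv2.imread("template.png", 0)

res = cv2.matchTemplate(screenshot, template, cv2.TM_SQDIFF)

# threshold = 0.1
# loc = np.where(res >= threshold)
# With TM_SQDIFF the best match is the minimum, so use min_loc
min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(res)

print(min_loc)  # (x, y) of the top-left corner of the best match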

which gives

(389, 412)
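
To see why the minimum is the best match with TM_SQDIFF, here is a rough plain-NumPy sketch of the same idea: a sum of squared differences at every template position. It is only an illustration, not OpenCV's implementation, and the helper name `sqdiff_map` and the synthetic random images are my own assumptions so the snippet runs without any files.

```python
import numpy as np


def sqdiff_map(image, template):
    """Sum of squared differences for every template position (plain NumPy)."""
    image = image.astype(np.float64)
    template = template.astype(np.float64)
    th, tw = template.shape
    ih, iw = image.shape
    result = np.empty((ih - th + 1, iw - tw + 1))
    for y in range(result.shape[0]):
        for x in range(result.shape[1]):
            patch = image[y:y + th, x:x + tw]
            result[y, x] = np.sum((patch - template) ** 2)
    return result


# Synthetic data (assumption): a random grayscale "screenshot" and a
# template cut out of it, with its top-left corner at x=25, y=12.
rng = np.random.default_rng(0)
screenshot = rng.integers(0, 256, size=(40, 60), dtype=np.uint8)
template = screenshot[12:20, 25:35].copy()

res = sqdiff_map(screenshot, template)
y, x = np.unravel_index(np.argmin(res), res.shape)
best = (int(x), int(y))  # (x, y) order, like min_loc from cv2.minMaxLoc
print(best, res[y, x])
```

A perfect match has a difference of zero, and every other position has a larger value. That is why a test like `res >= threshold` selects the worst positions rather than the best one.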

Screenshot:

Template
