Luiz

Reputation: 140

Best way to detect GUI buttons using an image file as reference in Python/OpenCV

I'm challenging myself to automate some tasks in a game called Pokemon TCG Online.

Since I don't know anything about reverse engineering, I'm trying to use computer vision to identify objects and perform tasks.

The game's GUI is always the same, so I don't have to deal with color variance and similar issues. My first thought was to use template matching, but I'm having a problem with false positives.

The other two alternatives I found were using a Haar cascade (I found a "bot" for another game that uses one) or training a neural network to recognize every element.

Before I go deep into one approach, I would like to find the best one, to avoid wasting time on something that doesn't work. Also, I don't want to "use a sledgehammer to crack a nut", so I'm looking for a simple and elegant solution.

My first approach was to use Python and OpenCV, since both are simple to use, but I'm open to other tools. I know how to use YOLO in Python, but I only managed to install it on Linux, and the game runs on Windows.

Thank you very much

The code I'm using:

import cv2
import numpy as np
import pyautogui
from PIL import ImageGrab

fourcc = cv2.VideoWriter_fourcc('X','V','I','D') #you can use other codecs as well.
vid = cv2.VideoWriter('record.avi', fourcc, 8, (1440,900))
jogar = cv2.imread("jogar.png",  0)

while True:
    # ImageGrab's bbox is (left, top, right, bottom)
    img = ImageGrab.grab(bbox=(0, 0, 1000, 1000))
    img_np = np.array(img)
    # PIL returns RGB, so convert with COLOR_RGB2GRAY
    img_npGray = cv2.cvtColor(img_np, cv2.COLOR_RGB2GRAY)
    #frame = cv2.cvtColor(img_np, cv2.COLOR_BGR2GRAY)
    vid.write(img_np)
    cv2.imshow("frame", img_npGray)
    res = cv2.matchTemplate(img_npGray, jogar, cv2.TM_SQDIFF)
    threshold  = 0.9
    loc = np.where (res >= threshold)
    # pyautogui.moveTo(loc)
    print(loc)
    
    
    key = cv2.waitKey(1)
    if key == 27:
        break    

vid.release()
cv2.destroyAllWindows()


Upvotes: 1

Views: 6861

Answers (2)

Artem

Reputation: 7423

If the game runs on Windows, you could also use testRigor and enable OCR there, which should let you do things like:

click "Jogar!"

Here is the documentation.

Upvotes: -1

bfris

Reputation: 5805

In my comment I said the tutorials in the official docs are great, and they are. But you have to do some searching for the sample images. Many of them are here, including the Messi picture used for the template matching tutorial.

This code works. With TM_SQDIFF, the best match is the minimum of the result, not the maximum. Also, you probably want the single best match from cv2.minMaxLoc rather than a threshold.

import cv2
import numpy as np

# Load both images as grayscale (flag 0)
screenshot = cv2.imread("screenshot.png", 0)
template = cv2.imread("template.png",  0)

# With TM_SQDIFF, lower scores mean better matches
res = cv2.matchTemplate(screenshot, template, cv2.TM_SQDIFF)

# threshold  = 0.1
# loc = np.where (res >= threshold)
min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(res)

# Top-left corner (x, y) of the best match
print(min_loc)

which gives

(389, 412)

Screenshot: [image]

Template: [image]

Upvotes: 2
