Karim Walid

Reputation: 83

How to detect an image and click it with pyautogui?

I want to make a bot click an image when it appears on the screen. I followed some YouTube tutorials, but I can't find the mistake in my code; this is literally my first time using Python. I tried the following code:

from pyautogui import *
import pyautogui
import time
import keyboard
import random
import win32api, win32con

time.sleep(5)

def click():
    win32api.mouse_event(win32con.MOUSEEVENTF_LEFTDOWN,0,0)
    win32api.mouse_event(win32con.MOUSEEVENTF_LEFTUP,0,0)

while keyboard.is_pressed('q') == False:
    flag = 0
    
    if pyautogui.locateOnScreen('benz.png', region=(0,0,1366,768), grayscale=True, confidence=0.5) != None:
                flag = 1
                click()
                time.sleep(0.05)
                break

                
                if flag == 1:
                 break

But I kept getting:

Traceback (most recent call last):
  File "c:\Program Files\Karim\autoclicker\main+stickman.py", line 17, in <module>
    if pyautogui.locateOnScreen('benz.png', region=(0,0,1366,768), grayscale=True, confidence=0.5) != None:
  File "C:\Users\bayan\AppData\Local\Programs\Python\Python310\lib\site-packages\pyautogui\__init__.py", line 175, in wrapper
    return wrappedFunction(*args, **kwargs)
  File "C:\Users\bayan\AppData\Local\Programs\Python\Python310\lib\site-packages\pyautogui\__init__.py", line 213, in locateOnScreen
    return pyscreeze.locateOnScreen(*args, **kwargs)
  File "C:\Users\bayan\AppData\Local\Programs\Python\Python310\lib\site-packages\pyscreeze\__init__.py", line 373, in locateOnScreen
    retVal = locate(image, screenshotIm, **kwargs)
  File "C:\Users\bayan\AppData\Local\Programs\Python\Python310\lib\site-packages\pyscreeze\__init__.py", line 353, in locate
    points = tuple(locateAll(needleImage, haystackImage, **kwargs))
  File "C:\Users\bayan\AppData\Local\Programs\Python\Python310\lib\site-packages\pyscreeze\__init__.py", line 207, in _locateAll_opencv
    needleImage = _load_cv2(needleImage, grayscale)
  File "C:\Users\bayan\AppData\Local\Programs\Python\Python310\lib\site-packages\pyscreeze\__init__.py", line 170, in _load_cv2
    raise IOError("Failed to read %s because file is missing, "
OSError: Failed to read benz.png because file is missing, has improper permissions, or is an unsupported or invalid format

Note: The benz.png file is in the same folder as the code, it's in PNG format, and it is a valid image (it opens and shows the photo when you double-click it).

There's probably a silly mistake in the code that I can't see, because I know almost nothing about Python 🙃

Upvotes: 3

Views: 23092

Answers (3)

jahantaila

Reputation: 908

PyAutoGUI has a built-in function called locateOnScreen(), which takes a screenshot and searches it for the image. If it finds a match, it returns the box of that match as (left, top, width, height); locateCenterOnScreen() returns the x, y coordinates of the center instead. Passing the box to pyautogui.click() clicks its center. If nothing is found, the function either returns None or raises ImageNotFoundException, depending on the installed version.

By default the image has to match exactly for this to work; i.e. if you want to click on button.png, that picture has to be exactly the same size and resolution as the button on your screen (the confidence argument relaxes this, but it requires OpenCV). One way to achieve this is to take a screenshot, open it in Paint, and crop out only the button you want pressed (or you could have PyAutoGUI do it for you, as shown in a later example).

import pyautogui

question_list = ['greencircle', 'redcircle', 'bluesquare', 'redtriangle']

user_input = input('Where should I click? ')

while user_input not in question_list:
    print('Incorrect input, available options: greencircle, redcircle, bluesquare, redtriangle')
    user_input = input('Where should I click? ')

location = pyautogui.locateOnScreen(user_input + '.png')
pyautogui.click(location)

The above example requires greencircle.png and all the other .png files to already be in your working directory.

PyAutoGUI can also take screenshots, and you can restrict the shot to a region of the screen with pyautogui.screenshot(region=(left, top, width, height)). The first two values are the x, y coordinates of the top-left corner of the region, the third is its width (how far it extends to the right), and the fourth is its height (how far it extends down).

The following example takes a screenshot of the Windows 10 logo, saves it to a file, and then clicks the logo by locating that .png file on the screen.
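The third and fourth values are easy to confuse with right and bottom coordinates. A minimal sketch of the conversion, assuming you measured two opposite corners of the area; region_from_corners is a hypothetical helper written for this illustration, not part of PyAutoGUI:

```python
def region_from_corners(left, top, right, bottom):
    """Convert two corner points into the (left, top, width, height)
    tuple expected by the region= argument."""
    if right <= left or bottom <= top:
        raise ValueError("right/bottom must be greater than left/top")
    return (left, top, right - left, bottom - top)


# Corners (0, 1041) and (50, 1080) describe a 50 x 39 pixel area.
print(region_from_corners(0, 1041, 50, 1080))  # (0, 1041, 50, 39)
```

The resulting tuple can then be passed as region= to pyautogui.screenshot() or pyautogui.locateOnScreen().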

import pyautogui

pyautogui.screenshot('win10_logo.png', region=(0, 1041, 50, 39))
location = pyautogui.locateOnScreen('win10_logo.png')
pyautogui.click(location)

You don't have to save the screenshot to a file either; you can keep it in a variable instead.

import pyautogui

win10 = pyautogui.screenshot(region=(0, 1041, 50, 39))
location = pyautogui.locateOnScreen(win10)
pyautogui.click(location)

Making a program detect if a user has clicked in a certain area (let's say, the windows 10 logo) would require another library like pynput.

from pynput.mouse import Listener    

def on_click(x, y, button, pressed):
    if 0 < x < 50 and 1080 > y > 1041 and str(button) == 'Button.left' and pressed:
        print('You clicked on Windows 10 Logo')
        return False    # get rid of return statement if you want a continuous loop

with Listener(on_click=on_click) as listener:
    listener.join()

PUTTING IT ALL TOGETHER

import pyautogui
from pynput.mouse import Listener

win10 = pyautogui.screenshot(region=(0, 1041, 50, 39))
location = pyautogui.locateOnScreen(win10)

# location[0] is the top left x coord
# location[1] is the top left y coord
# location[2] is the distance from left x coord to right x coord
# location[3] is the distance from top y coord to bottom y coord

x_boundary_left = location[0]
y_boundary_top = location[1]
x_boundary_right = location[0] + location[2]
y_boundary_bottom = location[1] + location[3]


def on_click(x, y, button, pressed):
    if x_boundary_left < x < x_boundary_right and y_boundary_bottom > y > y_boundary_top and str(button) == 'Button.left' and pressed:
        print('You clicked on Windows 10 Logo')
        return False    # get rid of return statement if you want a continuous loop


with Listener(on_click=on_click) as listener:
    listener.join()
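The boundary check inside on_click can be pulled out into a small function so it can be tried without a screen or a mouse listener. This is only a sketch: Box is a stand-in namedtuple defined here with the same field order as the box described in the comments above, and box_contains is a hypothetical helper name.

```python
from collections import namedtuple

# Stand-in for the located box: (left, top, width, height).
Box = namedtuple("Box", "left top width height")


def box_contains(box, x, y):
    """Return True if the point (x, y) lies strictly inside the box."""
    return (box.left < x < box.left + box.width
            and box.top < y < box.top + box.height)


logo = Box(0, 1041, 50, 39)
print(box_contains(logo, 25, 1060))  # True: inside the logo area
print(box_contains(logo, 60, 1060))  # False: to the right of it
```

Inside on_click, the long condition could then be written as box_contains(location, x, y), followed by the same button and pressed checks.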

Upvotes: 7

Karim Walid

Reputation: 83

I replaced the bare file name with the full path to the image. I changed:

if pyautogui.locateOnScreen('benz.png', region=(0,0,1366,768), grayscale=True, confidence=0.5) != None:

to:

if pyautogui.locateOnScreen('C:/Program Files/Karim/Others/benz.png', region=(0,0,1366,768), grayscale=True, confidence=0.5) != None:

and it worked 👍
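A likely reason this helps: a bare name such as 'benz.png' is looked up relative to the current working directory, which is not necessarily the folder that contains the script. A minimal sketch of building the path from the script's own location instead of hard-coding it; image_path is a hypothetical helper, and it assumes the image sits in the folder you pass in (by default, the script's folder):

```python
from pathlib import Path


def image_path(name, base_dir=None):
    """Return an absolute path to `name`, resolved against `base_dir`
    (by default the folder containing this script, not the current
    working directory)."""
    if base_dir is None:
        base_dir = Path(__file__).resolve().parent
    path = Path(base_dir) / name
    if not path.is_file():
        # Fail early with a message that shows where Python was looking.
        raise FileNotFoundError(f"{path} not found (cwd is {Path.cwd()})")
    return str(path)
```

The call would then look like pyautogui.locateOnScreen(image_path('benz.png'), ...) with the same keyword arguments as before.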

Upvotes: 1

Mohammad

Reputation: 3396

It could be a permission problem caused by more than one instance of the script running pyautogui at the same time, so that the file cannot be accessed.

In any case, you could work around the issue by reading the file directly, e.g.:

import pyautogui
from cv2 import imread

image = imread('benz.png')  # imread returns None (no exception) if the file cannot be read
if pyautogui.locateOnScreen(image,... # and so on 

Upvotes: 2
