How do I get the coordinates of an image shown in OpenCV?

Question

Sorry, but the title doesn't really make sense.

I am trying to make an AI that clicks on the ball to make it bounce. For context, here's a picture of the application.

In the game, when you click the ball it goes up and then comes back down, and the aim of the game is to keep it up.

I have written some code that turns the image into a mask with OpenCV. Here's a picture of the result.

What I now need to do is find the location of the ball in pixels/coordinates so I can make the mouse move to it and click it. By the way, the ball has a margin on the left and right of it, so it doesn't just go straight up and down but left and right too. Also, the ball isn't animated, it's just a moving image.

How would I get the ball location in pixels/coordinates so I can move the mouse to it?

Here's a copy of my code:

import numpy as np
from PIL import ImageGrab
import cv2
import time
import pyautogui


def draw_lines(img, lines):
    # HoughLinesP returns None when no lines are found
    if lines is None:
        return
    for line in lines:
        coords = line[0]
        cv2.line(img, (coords[0], coords[1]), (coords[2], coords[3]), [255,255,255], 3)

def process_img(original_image):
    # ImageGrab returns RGB (not BGR) data
    processed_img = cv2.cvtColor(original_image, cv2.COLOR_RGB2GRAY)
    processed_img = cv2.Canny(processed_img, threshold1=200, threshold2=300)
    vertices = np.array([[0,0],[0,800],[850,800],[850,0]
                         ], np.int32)
    processed_img = roi(processed_img, [vertices])

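    # more info: http://docs.opencv.org/3.0-beta/doc/py_tutorials/py_imgproc/py_houghlines/py_houghlines.html
    # min length and max gap must be passed by keyword: the fifth positional
    # argument of HoughLinesP is the optional output array `lines`
    #                       edges          rho theta      thresh
    lines = cv2.HoughLinesP(processed_img, 1,  np.pi/180, 180,
                            minLineLength=20, maxLineGap=15)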
    draw_lines(processed_img,lines)
    return processed_img

def roi(img, vertices):
    #blank mask:
    mask = np.zeros_like(img)
    # fill the mask
    cv2.fillPoly(mask, vertices, 255)
    # now only show the area that is the mask
    masked = cv2.bitwise_and(img, mask)
    return masked
def main():
    last_time = time.time()
    while(True):
        screen =  np.array(ImageGrab.grab(bbox=(0,40, 800, 850)))
        new_screen = process_img(screen)
        print('Loop took {} seconds'.format(time.time()-last_time))
        last_time = time.time()
        cv2.imshow('window', new_screen)
        #cv2.imshow('window2', cv2.cvtColor(screen, cv2.COLOR_BGR2RGB))
        if cv2.waitKey(25) & 0xFF == ord('q'):
            cv2.destroyAllWindows()
            break

def mouse_movement():
    ## Set to move relative to where ball is
    pyautogui.moveTo(300, 400)
    pyautogui.click()

main()

Sorry if this is confusing but brain.exe has stopped working :( Thanks

Mark Setchell · Accepted Answer

I was working on your other, related question when you deleted it and see you are having performance issues in locating the ball. As your ball appears to be on a nice, simple white background (apart from the score and the close button at top right), there are easier/faster ways of finding the ball.

First, work in greyscale so that you only have 1 channel, instead of 3 channels of RGB to process - that is generally faster.

Then, overwrite the score and menu at top-right with white pixels so that the only thing left in the image is the ball. Now invert the image so that all the whites become black, then you can use findNonZero() to find anything that is not the background, i.e. the ball.

Now find the lowest and highest coordinate in the y-direction and average them for the centre of the ball, likewise in the x-direction for the other way.

#!/usr/bin/env python3

import cv2

# Load image - work in greyscale as there is 1/3 as much data to process
im = cv2.imread('ball.png',cv2.IMREAD_GRAYSCALE)

# Overwrite "Current Best" with white - these numbers will vary depending on what you capture
im[134:400,447:714] = 255

# Overwrite menu and "Close" button at top-right with white - these numbers will vary depending on what you capture
im[3:107,1494:1726] = 255

# Negate image so whites become black
im=255-im

# Find anything not black, i.e. the ball
nz = cv2.findNonZero(im)

# Find left (a), right (b), top (c) and bottom (d) edges of ball - findNonZero returns (x, y) points
a = nz[:,0,0].min()
b = nz[:,0,0].max()
c = nz[:,0,1].min()
d = nz[:,0,1].max()
print('a:{}, b:{}, c:{}, d:{}'.format(a,b,c,d))

# Average left and right edges, top and bottom edges, to give centre as (x, y)
c0 = (a+b)/2
c1 = (c+d)/2
print('Ball centre: {},{}'.format(c0,c1))

That gives:

a:442, b:688, c:1063, d:1304
Ball centre: 565.0,1183.5

and a red box drawn at those coordinates encloses the ball.

The processing takes 845 microseconds on my Mac, i.e. less than a millisecond, which corresponds to 1,183 frames per second. Obviously you also have the time taken to grab the screen on top of that, but I can't control that.

Note that you could also resize the image down by a factor of, say, 4 (or maybe 8 or 16) in each direction and still be sure of finding the ball, which may make it even faster.

Keywords: Ball, track, tracking, locating, finding, position of, image, image processing, python, OpenCV, numpy, bounding box, bbox.
