Saurabh Gaur

Reputation: 23805

How to find an element by image

As we know, Selenium supports multiple locator strategies for finding an element on a web page.

But my requirement is different: on some sites, none of the locators supported by Selenium is enough to identify an element uniquely.

Since Selenium makes it possible to create a custom locator strategy, I am trying to create an image locator that can find an element from the Base64 string of a sub-image, as Appium does.

Steps for the image locator:

  1. Launch the browser with the URL
  2. Capture a screenshot of the page
  3. Detect the x, y location of the sub-image within the screenshot
  4. Find the element on the page using the x, y location

To achieve this, I am creating a custom image locator as below:

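import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.List;
import javax.imageio.ImageIO;
import javax.xml.bind.DatatypeConverter;
import org.openqa.selenium.By;
import org.openqa.selenium.JavascriptExecutor;
import org.openqa.selenium.NoSuchElementException;
import org.openqa.selenium.OutputType;
import org.openqa.selenium.SearchContext;
import org.openqa.selenium.TakesScreenshot;
import org.openqa.selenium.WebElement;

public class ByImage extends By {

    private final String imageBase64String;

    /**
     * @param imageBase64String Base64-encoded bytes of the sub-image to locate
     */
    public ByImage(String imageBase64String) {
        this.imageBase64String = imageBase64String;
    }

    @Override
    public WebElement findElement(SearchContext context) {
        List<WebElement> els = findElements(context);
        if (!els.isEmpty()) {
            return els.get(0);
        }
        throw new NoSuchElementException("Element not found");
    }

    @Override
    @SuppressWarnings("unchecked")
    public List<WebElement> findElements(SearchContext context) {
        // Get the current screenshot (the context must be a driver that implements TakesScreenshot)
        byte[] screenshotByte = ((TakesScreenshot) context).getScreenshotAs(OutputType.BYTES);
        byte[] subImgToFindByte = DatatypeConverter.parseBase64Binary(imageBase64String);
        // Decode the sub-image to get its width and height
        BufferedImage bufferedSubImgToFind;
        try {
            bufferedSubImgToFind = ImageIO.read(new ByteArrayInputStream(subImgToFindByte));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }

        // Here I need a mechanism to get the coordinates of the sub-image within the screenshot.
        // Suppose I am able to find x, y (top-left corner of the match); placeholders for now.
        double x = 0;
        double y = 0;

        // Calculate the center point of the matched region
        int centerX = (int) (x + bufferedSubImgToFind.getWidth() / 2.0);
        int centerY = (int) (y + bufferedSubImgToFind.getHeight() / 2.0);

        // Find the elements at those coordinates
        JavascriptExecutor js = (JavascriptExecutor) context;

        return (List<WebElement>) js.executeScript(
                "return document.elementsFromPoint(arguments[0], arguments[1]);", centerX, centerY);
    }
}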

The test case looks like this:

WebDriver driver = new ChromeDriver();
driver.get("<URL>");
WebElement elementByImage = driver.findElement(new ByImage("<Base64 String of the subimage>"));
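
The Base64 placeholder in the test case has to be produced from an actual image. A minimal sketch of one way to do that, assuming the sub-image is available as a BufferedImage and PNG encoding is acceptable. The class SubImageBase64 and its method names are hypothetical, not part of Selenium. It uses java.util.Base64 because javax.xml.bind.DatatypeConverter is no longer bundled with the JDK from Java 11 onward.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Base64;
import javax.imageio.ImageIO;

// Hypothetical helper for illustration; not part of Selenium.
final class SubImageBase64 {

    /** Encodes an image as PNG and returns the bytes as a Base64 string. */
    static String encode(BufferedImage image) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            if (!ImageIO.write(image, "png", out)) {
                throw new IOException("No PNG writer available");
            }
            return Base64.getEncoder().encodeToString(out.toByteArray());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Decodes a Base64 string produced by encode() back into an image. */
    static BufferedImage decode(String base64) {
        try {
            byte[] bytes = Base64.getDecoder().decode(base64);
            BufferedImage image = ImageIO.read(new ByteArrayInputStream(bytes));
            if (image == null) {
                throw new IOException("Bytes are not a readable image");
            }
            return image;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With such a helper, the locator could be built as new ByImage(SubImageBase64.encode(croppedImage)), where croppedImage is whatever region was cut out of a reference screenshot.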

I'm able to achieve everything except detecting the exact coordinates of the sub-image within the screenshot, for which I need a good library.

Could anyone suggest a better approach to achieve this?

Upvotes: 0

Views: 3392

Answers (2)

Saurabh Gaur

Reputation: 23805

As @Dmitri suggested, I'm going with the Java bindings for OpenCV.

Download the appropriate OpenCV build, add it to the classpath (the native library also has to be loadable by the JVM), and get the coordinates like this:

import org.opencv.core.Core;
import org.opencv.core.Core.MinMaxLocResult;
import org.opencv.core.CvType;
import org.opencv.core.Mat;
import org.opencv.core.MatOfByte;
import org.opencv.core.Point;
import org.opencv.imgcodecs.Imgcodecs;
import org.opencv.imgproc.Imgproc;

// Inside findElements(SearchContext context) of the ByImage class from the question
byte[] screenshotByte = ((TakesScreenshot) context).getScreenshotAs(OutputType.BYTES);
byte[] subImgToFindByte = DatatypeConverter.parseBase64Binary(imageBase64String);

System.loadLibrary(Core.NATIVE_LIBRARY_NAME);
// IMREAD_COLOR decodes both images to 3 channels; matchTemplate needs image and template of the same type
Mat source = Imgcodecs.imdecode(new MatOfByte(screenshotByte), Imgcodecs.IMREAD_COLOR);
Mat template = Imgcodecs.imdecode(new MatOfByte(subImgToFindByte), Imgcodecs.IMREAD_COLOR);

int resultCols = source.cols() - template.cols() + 1;
int resultRows = source.rows() - template.rows() + 1;
Mat outputImage = new Mat(resultRows, resultCols, CvType.CV_32FC1);

// Template matching; with TM_SQDIFF_NORMED the best match is the minimum of the result
Imgproc.matchTemplate(source, template, outputImage, Imgproc.TM_SQDIFF_NORMED);

MinMaxLocResult mmr = Core.minMaxLoc(outputImage);
// Top-left corner of the best match
Point point = mmr.minLoc;
double x = point.x;
double y = point.y;

// Find the element at the center point of the matched region
int centerX = (int) (x + template.cols() / 2.0);
int centerY = (int) (y + template.rows() / 2.0);

JavascriptExecutor js = (JavascriptExecutor) context;
WebElement el = (WebElement) js.executeScript("return document.elementFromPoint(arguments[0], arguments[1]);", centerX, centerY);
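
One thing to watch: the match location is in screenshot pixels, whereas document.elementFromPoint expects CSS pixels relative to the viewport. On high-DPI displays the two can differ by the factor window.devicePixelRatio. A minimal sketch of the conversion, assuming the screenshot covers exactly the viewport, the template was captured at the same scale as the screenshot, and the ratio has been read separately (for example with js.executeScript("return window.devicePixelRatio;") and converted through Number.doubleValue()). The class ViewportPoint is hypothetical.

```java
// Hypothetical helper: converts a template match in screenshot pixels
// into the CSS-pixel center point expected by document.elementFromPoint.
final class ViewportPoint {
    final int x;
    final int y;

    private ViewportPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    /**
     * @param matchX           left edge of the match, in screenshot pixels
     * @param matchY           top edge of the match, in screenshot pixels
     * @param templateWidth    template width, in screenshot pixels
     * @param templateHeight   template height, in screenshot pixels
     * @param devicePixelRatio value of window.devicePixelRatio (1.0 on standard displays)
     */
    static ViewportPoint centerOf(double matchX, double matchY,
                                  int templateWidth, int templateHeight,
                                  double devicePixelRatio) {
        if (devicePixelRatio <= 0) {
            throw new IllegalArgumentException("devicePixelRatio must be positive");
        }
        // Center of the matched region, still in screenshot pixels
        double centerX = matchX + templateWidth / 2.0;
        double centerY = matchY + templateHeight / 2.0;
        // Scale down to CSS pixels
        return new ViewportPoint(
                (int) Math.round(centerX / devicePixelRatio),
                (int) Math.round(centerY / devicePixelRatio));
    }
}
```

With a ratio of 1.0 this reduces to the plain center-point calculation; with a ratio of 2.0 the same on-screen element would be matched at twice the coordinates in the screenshot and is scaled back accordingly.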

Hope this helps.

Upvotes: -2

Dmitri T

Reputation: 168042

There are several options you can go for:

  1. Use the Java bindings for OpenCV to look up the sub-image in the main screenshot; check out the Template Matching article for a comprehensive explanation and code snippets.
  2. Project Sikuli provides simple APIs for image recognition and interaction.
  3. SeeTest Automation provides image recognition and an Object Repository pattern implementation for image templates.
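
To give a feel for what template matching (option 1) computes, here is a dependency-free sketch of normalized squared-difference matching on grayscale images stored as int arrays. It illustrates the idea only and is not OpenCV's implementation: the class and method names are hypothetical, and a brute-force scan like this is far too slow for full-page screenshots.

```java
// Illustrative only: brute-force normalized squared-difference matching
// on grayscale images stored as int[row][column] arrays.
final class NaiveTemplateMatch {

    /** Returns {x, y} of the top-left corner with the lowest (best) score. */
    static int[] bestMatch(int[][] source, int[][] template) {
        int th = template.length, tw = template[0].length;
        int sh = source.length, sw = source[0].length;
        if (th > sh || tw > sw) {
            throw new IllegalArgumentException("Template is larger than source");
        }
        double best = Double.MAX_VALUE;
        int bestX = 0, bestY = 0;
        // Slide the template over every position where it fits entirely
        for (int y = 0; y <= sh - th; y++) {
            for (int x = 0; x <= sw - tw; x++) {
                double s = score(source, template, x, y);
                if (s < best) {
                    best = s;
                    bestX = x;
                    bestY = y;
                }
            }
        }
        return new int[] {bestX, bestY};
    }

    /**
     * Sum of squared differences divided by sqrt(sum(T^2) * sum(I^2)).
     * A score of 0 means the window is identical to the template.
     */
    static double score(int[][] source, int[][] template, int offX, int offY) {
        double diff = 0, sumT = 0, sumI = 0;
        for (int r = 0; r < template.length; r++) {
            for (int c = 0; c < template[0].length; c++) {
                double t = template[r][c];
                double i = source[offY + r][offX + c];
                diff += (t - i) * (t - i);
                sumT += t * t;
                sumI += i * i;
            }
        }
        double norm = Math.sqrt(sumT * sumI);
        // Fall back to the unnormalized sum when either window is all zeros
        return norm == 0 ? diff : diff / norm;
    }
}
```

Because lower scores mean a closer match with this measure, the best location is the minimum of the score map, not the maximum. In practice it is also worth rejecting a best score above some threshold, since the scan always returns a position even when the sub-image is not on the page.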

Upvotes: 3
