Detect Areas of Text in Screenshot

Question

I'm working on a project to increase the ability for wine to automatically test software packages. What I'm looking to do now is detect text in the screen capture of the current window. I can then parse all of the text and use autohotkey to give a mouse click on the coordinates of the text I want.

For example, in firefox, I might want to test different things, the first open being opening preferences. I would then need to parse the screenshot of firefox, detect all of the separate locations of text. I can then run these separate images of text into tesseract-ocr and detect which one, says "Edit". I then redo this again for "preferences".

I've tried to find a solution but so far can't find anything. I'd prefer a solution that uses python or has python binds as thats what I've been programing in so far.

Giuseppe Cardone · Accepted Answer

A possible starting point is Project SIKULI. It is a tool to automate GUI testing. It is written in Java, nonetheless it includes a scripting environment based on Jython, hence modifying it to support python script may be not too difficult.

Detect Areas of Text in Screenshot

Answers (1)

Related Questions