user1251385

Reputation: 197

How to detect if an image is present on screen?

Here is the image I need to detect: http://s13.postimg.org/wt8qxoco3/image.png

Here is the base64 representation: http://pastebin.com/raw.php?i=TZQUieWe

I'm asking for your help because this is a complex problem and I am not equipped to solve it. It would probably take me a week to do it by myself.

Some pseudo-code that I thought about:

1) Take a screenshot of the app and store it as an image object.

2) Convert the base64 representation of my image to an image object.

3) Use some sort of algorithm/function to compare both image objects.

By on screen, I mean in an app. I have the app's window name and the PID.

To be 100% clear, I need to essentially detect if image1 is inside image2. image1 is the image I gave in the OP. image2 is a screenshot of a window.

Upvotes: 4

Views: 14774

Answers (3)

Sako73

Reputation: 10147

It looks like Pillow (https://python-pillow.org/) is a more up-to-date version of PIL.

Upvotes: 0

terra823

Reputation: 92

This is probably the best place to start:

http://effbot.org/imagingbook/image.htm

If you don't have access to the image's metadata, file name, type, etc., what you're trying to do is very difficult, but your pseudo-code sounds on point. Essentially, you'll have to create an algorithmic model based on a photo's shapes, lines, size, colors, etc. Then you'd have to match that model against models already made and indexed in some database. Hope that helps.

Upvotes: 1

abarnert

Reputation: 365975

If you break this down into pieces, they're all pretty simple.

First, you need a screenshot of the app's window as a 2D array of pixels. There are a variety of different ways to do this in a platform-specific way, but you didn't mention what platform you're on, so… let's just grab the whole screen, using PIL:

from PIL import ImageGrab
screenshot = ImageGrab.grab()  # grab the whole screen
haystack = screenshot.load()  # pixel access object, indexed as haystack[x, y]

Now, you need to convert your base64 into an image. Taking a quick look at it, it's clearly just an encoded PNG file. So:

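import base64
import io
from PIL import Image
decoded = base64.b64decode(data)  # data holds the base64 text
f = io.BytesIO(decoded)  # wrap the PNG bytes in a file-like object
image = Image.open(f)
needle = image.load()  # pixel access object, indexed as needle[x, y]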

Now you've got a 2D array of pixels, and you want to see if it exists in another 2D array. There are faster ways to do this (using numpy is probably best), but there's also a dumb brute-force way, which is a lot simpler to understand: just iterate the rows of haystack; for each one, iterate the columns, and see if you find a run of pixels that matches the first row of needle. If so, keep going through the rest of the rows until you either finish all of needle, in which case you return True, or find a mismatch, in which case you continue the scan from the next starting position.
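A minimal sketch of that brute-force search, assuming both images have already been turned into plain lists of rows, where each row is a list of pixel tuples. The name find_subimage and the tiny hand-made "images" are made up for illustration; it returns the matching position instead of just True, and it requires an exact pixel-for-pixel match, so both images would need to be in the same mode (for example via convert('RGB')):

```python
def find_subimage(haystack, needle):
    """Return (x, y) of the top-left corner of the first exact match, or None.

    Both arguments are assumed to be non-empty lists of rows, and each row
    a list of pixel values (for example (R, G, B) tuples).
    """
    hay_h, hay_w = len(haystack), len(haystack[0])
    ndl_h, ndl_w = len(needle), len(needle[0])
    # Only try positions where the needle fits entirely inside the haystack.
    for y in range(hay_h - ndl_h + 1):
        for x in range(hay_w - ndl_w + 1):
            # Compare row by row; all() stops at the first mismatching row.
            if all(haystack[y + dy][x:x + ndl_w] == needle[dy]
                   for dy in range(ndl_h)):
                return (x, y)
    return None


# Tiny hand-made "images": each pixel is an (R, G, B) tuple.
W, B, R = (255, 255, 255), (0, 0, 0), (255, 0, 0)
screen = [
    [W, W, W, W],
    [W, B, R, W],
    [W, R, B, W],
]
icon = [
    [B, R],
    [R, B],
]
print(find_subimage(screen, icon))  # prints (1, 1)
```

To feed it real PIL images, one option is to build the row lists from the pixel access objects, for example [[px[x, y] for x in range(w)] for y in range(h)] with w, h = img.size. This is slow for a full screenshot, so it is only meant to show the idea.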

Upvotes: 3
