Can I train something to detect objects on a screen and click in the appropriate location based on the results?

Question

I will outline the specific characteristics of my request and then elaborate:

Identify the screen coordinates of specific strings that appear among graphics and other elements
Identify specific 'objects' on the screen, e.g. a simple object such as a rectangle with text or a circle with a smiley in it

A good example is online poker.

    P1---------P2---------P3
  c1 c2      c3 c4      c5 c6
    |                     |
    |    s1 s2 s3 s4 s5   |
    |                     |
  c7 c8      c9 c10    c11 c12
    P4---------P5---------P6

Rules:

Players (Pn) 1 - 6 sit around a table
You cannot be guaranteed to sit in the same seat
Each player has 2 cards (cn) which sit near them and which only they can see
There are 5 shared (sn) cards in the center
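
Going by the diagram above, seat Pn appears to own card slots c(2n-1) and c(2n), so once my seat is known the positions of my two cards follow from it. A minimal sketch of that mapping, assuming the numbering in the diagram holds (`card_slots` is a hypothetical helper name, not part of any library):

```python
from typing import Tuple


def card_slots(seat: int) -> Tuple[str, str]:
    """Return the two card-slot labels next to seat `seat` (1 to 6).

    Assumption taken from the diagram: seat n owns slots c(2n-1) and c(2n).
    """
    if not 1 <= seat <= 6:
        raise ValueError("seat must be between 1 and 6")
    return (f"c{2 * seat - 1}", f"c{2 * seat}")


# Example: the player sitting at P4 holds the cards drawn at c7 and c8.
my_cards = card_slots(4)
```

Because the seat is not fixed, the seat number would have to be detected first (for instance by locating the player's name on screen) and then passed to this lookup.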

Description:

Your name is 'P1' and the algorithm searches for this string to find your location on the screen
It knows the cards near you and identifies them
It can count how many players are at the table
It can read the shared cards on the table
It processes the information and clicks on the appropriate button, e.g. call, raise, fold
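
To make the "find it on the screen, then click" part concrete, here is a rough sketch of one possible non-learning baseline: brute-force template matching by sum of squared differences. It assumes the screen and the thing being searched for (say, a small image of the 'P1' label) are already available as grayscale pixel grids; `find_template` and `click_point` are hypothetical names, and capturing the screen and sending the actual mouse click are left out.

```python
from typing import List, Optional, Tuple

Grid = List[List[int]]  # grayscale pixels, indexed as grid[row][col]


def find_template(screen: Grid, template: Grid,
                  max_ssd: int = 0) -> Optional[Tuple[int, int]]:
    """Return (row, col) of the top-left corner of the best match.

    Every placement of the template is scored by the sum of squared
    differences (SSD); the lowest score wins. Returns None when even the
    best score exceeds `max_ssd` (0 means an exact match is required).
    """
    th, tw = len(template), len(template[0])
    sh, sw = len(screen), len(screen[0])
    best_score, best_pos = None, None
    for r in range(sh - th + 1):
        for c in range(sw - tw + 1):
            ssd = sum(
                (screen[r + i][c + j] - template[i][j]) ** 2
                for i in range(th)
                for j in range(tw)
            )
            if best_score is None or ssd < best_score:
                best_score, best_pos = ssd, (r, c)
    if best_score is None or best_score > max_ssd:
        return None
    return best_pos


def click_point(top_left: Tuple[int, int], template: Grid) -> Tuple[int, int]:
    """Convert a match position into the (x, y) centre to click on."""
    row, col = top_left
    return (col + len(template[0]) // 2, row + len(template) // 2)


# Tiny made-up example: a 2x2 "button" hidden in a 5x6 blank screen.
button = [[9, 8], [7, 6]]
screen = [[0] * 6 for _ in range(5)]
for i in range(2):
    for j in range(2):
        screen[2 + i][3 + j] = button[i][j]

match = find_template(screen, button)
target = click_point(match, button) if match else None
```

This exhaustive search is slow on real screenshots and only tolerates small pixel differences (via `max_ssd`), which is part of why I am asking whether a trained model would be a better fit.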

The example sums up the desired characteristics, as I may not have articulated them properly.

Can machine learning be applied to this problem? Is this a particularly difficult task? Any other advice?

Answers (1)
