Matt

Reputation: 2863

Using OpenCV to touch and select object

I'm using the OpenCV framework in an iOS Xcode Objective-C project. Is there a way to process the image feed from the video camera, let the user touch an object on the screen, and then use some OpenCV functionality to highlight it?

Here is graphically what I mean. The first image shows an example of what the user might see in the video feed:

[image: example video feed]

Then when they tap on the screen, I want to use OpenCV feature/object detection to process the area they've tapped and highlight it. It would look something like this if they tapped the iPad:

[image: video feed with the tapped iPad highlighted]

Any ideas on how this would be achievable in Objective-C with OpenCV?

I can see quite easily how we could achieve this using trained templates of the iPad and matching them with OpenCV algorithms, but I want to make it dynamic so users can touch anything on the screen and we'll take it from there.

Upvotes: 1

Views: 2550

Answers (3)

ibezito

Reputation: 5822

Explanation: why the segmentation approach

As I understand it, the task you are trying to solve is segmentation of objects, regardless of their identity.

The object recognition approach is one way to do it. But it has two major downsides:

  1. It requires you to train an object classifier and to collect a dataset containing a respectable number of examples of the objects you would like to recognize. If you choose a classifier that is already trained, it won't necessarily work on every type of object you would like to detect.
  2. Most object recognition solutions find a bounding box around the recognized object, but they don't perform a complete segmentation of it. The segmentation part requires extra effort.

Therefore, I believe the best approach for your case is an image segmentation algorithm. More precisely, we'll use the GrabCut segmentation algorithm.

The GrabCut algorithm

This is an iterative algorithm with two stages:

  1. Initial stage - the user specifies a bounding box around the object. Given this bounding box, the algorithm estimates the color distributions of the foreground (the object) and the background using GMMs, followed by a graph-cut optimization that finds the optimal boundary between foreground and background.

  2. In the next stage, the user may correct the segmentation if needed by supplying scribbles marking foreground and background. The algorithm updates the model accordingly and performs a new segmentation based on the added information.

Using this approach has pros and cons. The pros:

  1. The segmentation algorithm is easy to implement with OpenCV.
  2. It enables the user to fix segmentation errors if needed.
  3. It doesn't rely on collecting a dataset and training a classifier.

The main con is that you will need an extra source of information from the user besides a single tap on the screen. This information is a bounding box around the object, and in some cases additional scribbles will be required to correct the segmentation.
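One possible way to soften this requirement (my suggestion, not part of the algorithm itself) is to derive the initial rect automatically from the tap: center a fixed-size box on the touch point and clamp it to the frame, falling back to scribbles only when that guess is wrong. A minimal sketch:

```python
def rect_from_tap(tap_x, tap_y, frame_w, frame_h, box_size=200):
    """Build an initial GrabCut rect (x, y, w, h) centered on a tap,
    clamped so it stays fully inside the frame.
    Assumes box_size <= frame dimensions."""
    half = box_size // 2
    x = max(0, min(tap_x - half, frame_w - box_size))
    y = max(0, min(tap_y - half, frame_h - box_size))
    return (x, y, box_size, box_size)
```

The fixed `box_size` is a guess at the object scale; a pinch gesture or a couple of preset sizes could refine it.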

Code

Luckily, there is an implementation of this algorithm in OpenCV. Itseez (the OpenCV maintainers) provide a simple, easy-to-use sample of OpenCV's GrabCut algorithm, which can be found here: https://github.com/Itseez/opencv/blob/master/samples/cpp/grabcut.cpp

Application usage:

The application receives a path to an image file as a command-line argument. It renders the image on the screen, and the user is required to supply an initial bounding rect.

The user can press 'n' to perform the segmentation for the current iteration, or press 'r' to reset.

After choosing a rect, the segmentation is calculated. If the user wants to correct it, they may add foreground or background scribbles with Shift+left-click and Ctrl+left-click, respectively.

Examples

Segmenting the iPod:

[image: GrabCut segmentation of the iPod]

Segmenting the pen:

[image: GrabCut segmentation of the pen]

Upvotes: 1

taarraas

Reputation: 1483

The task you are trying to solve is called "object proposal" generation. It doesn't work very accurately yet, and the results are very new. These two articles give a good overview of the methods: https://pdollar.wordpress.com/2013/12/10/a-seismic-shift-in-object-detection/ https://pdollar.wordpress.com/2013/12/22/generating-object-proposals/

For state-of-the-art results, look at the latest CVPR papers on object proposals. Quite often the authors make code available for testing.

Upvotes: 0

Abhishek

Reputation: 69

You can do it by training a classifier on iPad images using OpenCV's Haar cascade classifiers and then detecting the iPad in a given frame.

Then, based on the coordinates of the touch, check whether that point overlaps the detected iPad region. If it does, draw a bounding box around the detected object; from there you can proceed to process the detected iPad image.
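The overlap check between the touch coordinates and a detection rectangle (in the `(x, y, w, h)` form that OpenCV's `detectMultiScale` returns) is a simple containment test. A sketch, with the detection list assumed to come from the trained cascade:

```python
def touched_detection(tap_x, tap_y, detections):
    """Return the first detected rect (x, y, w, h) that contains the
    tap point, or None if the tap missed every detection."""
    for (x, y, w, h) in detections:
        if x <= tap_x < x + w and y <= tap_y < y + h:
            return (x, y, w, h)
    return None
```

The returned rect is what you would then draw as the highlight box on the video frame.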

Repeat the above procedure for each object class you want to detect.

Upvotes: 0
