Hilton Pintor

Reputation: 300

Recognize specific images, not the objects in the images

I need to recognize specific images using the iPhone camera. My goal is to have a set of 20 images such that, when a print or other display of one of them is in front of the camera, the app recognizes which image it is.

I thought about using classifiers (CoreML), but I don't think that would give the intended result. For example, if I had a model that recognizes fruits and showed it two different pictures of a banana, it would recognize them both as bananas, which is not what I want. I want my app to recognize specific images, regardless of their content.

The behavior I want is exactly what ARToolKit does (https://www.artoolkit.org/documentation/doku.php?id=3_Marker_Training:marker_nft_training), but I do not wish to use this library.

So my question is: are there any other libraries, or other approaches, for recognizing specific images from the camera on iOS (preferably in Swift)?

Upvotes: 0

Views: 778

Answers (2)

Hilton Pintor

Reputation: 300

Answering my own question.

I ended up following this awesome tutorial that uses OpenCV to recognize specific images, and teaches how to make a wrapper so this code can be accessed by Swift.

Upvotes: 0

DoesData

Reputation: 7047

Since you are using images specific to your use case, there isn't going to be an existing model that you can use. You'd have to create a model, train it, and then import it into CoreML. It's hard to provide specific advice since I know nothing about your images.

As far as libraries are concerned, check out this list and Swift-AI.

Swift-AI has a neural network that you might be able to train if you had enough images.

Most likely you will have to create the model in another language, such as Python, and then import it into your Xcode project.

Take a look at this question.

This blog post goes into some detail about how to train your own model for CoreML.

Keras is probably your best bet to build your model. Take a look at this tutorial.

There are other problems too, though. You only have 20 images, which is certainly not enough to train an accurate model. The user can also present modified versions of these images, so you'd have to generate realistic samples of each possible image and then use that entire set to train the model. I'd say you need a minimum of 20 samples of each image (400 total).
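As a rough illustration of generating samples from each original, here is a minimal Python sketch using only the standard library. It assumes each image is already available as a grayscale grid of 0..255 integers. The function names, the distortion ranges (brightness offset, shift, noise amplitude), and the choice of distortions are all hypothetical; realistic camera distortions such as perspective and blur would need an image library.

```python
import random
from typing import List

Image = List[List[int]]  # grayscale rows, each pixel in 0..255 (assumed layout)


def clamp(value: int) -> int:
    """Keep a pixel value inside the valid 0..255 range."""
    return max(0, min(255, value))


def adjust_brightness(image: Image, offset: int) -> Image:
    """Add the same offset to every pixel."""
    return [[clamp(p + offset) for p in row] for row in image]


def shift_right(image: Image, pixels: int, fill: int = 0) -> Image:
    """Move the content right by `pixels` columns, padding the left edge."""
    return [[fill] * pixels + row[:len(row) - pixels] for row in image]


def add_noise(image: Image, amplitude: int, rng: random.Random) -> Image:
    """Perturb every pixel by a random amount in [-amplitude, amplitude]."""
    return [[clamp(p + rng.randint(-amplitude, amplitude)) for p in row]
            for row in image]


def make_variants(image: Image, count: int = 20, seed: int = 0) -> List[Image]:
    """Produce `count` randomly distorted copies of one image."""
    rng = random.Random(seed)  # seeded so the set is reproducible
    variants = []
    for _ in range(count):
        variant = adjust_brightness(image, rng.randint(-40, 40))
        variant = shift_right(variant, rng.randint(0, 2))
        variant = add_noise(variant, 8, rng)
        variants.append(variant)
    return variants


if __name__ == "__main__":
    # Twenty placeholder 16x16 images standing in for the real ones.
    originals = [[[(x * y + i) % 256 for x in range(16)] for y in range(16)]
                 for i in range(20)]
    dataset = [v for image in originals for v in make_variants(image)]
    print(len(dataset))  # 20 originals x 20 variants each = 400
```

With 20 originals and the default of 20 variants each, this yields the 400 samples mentioned above; whether that is enough depends on the model and how varied the real captures are.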

You'll want to pre-process the image and extract features that you can compare to the known features of your images. This is how facial recognition works. Here is a guide for facial recognition that might be able to help you with feature extraction.
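To make "extract features and compare them to known features" concrete, here is a minimal sketch in plain Python. It uses an average hash (a simple perceptual hash), which is a stand-in technique chosen for brevity and is not what facial recognition systems or the guide use. It assumes images arrive as grayscale grids of 0..255 integers at least 8x8 in size; the function names and the `max_distance` threshold are hypothetical. A real camera pipeline would also need to locate and rectify the image in the frame first.

```python
from typing import Dict, List, Optional, Sequence

Image = Sequence[Sequence[int]]  # grayscale rows, each pixel in 0..255 (assumed layout)


def downsample(image: Image, size: int = 8) -> List[List[float]]:
    """Shrink to size x size by averaging blocks; needs at least size x size pixels."""
    height, width = len(image), len(image[0])
    grid = []
    for gy in range(size):
        y0, y1 = gy * height // size, (gy + 1) * height // size
        row = []
        for gx in range(size):
            x0, x1 = gx * width // size, (gx + 1) * width // size
            block = [image[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            row.append(sum(block) / len(block))
        grid.append(row)
    return grid


def average_hash(image: Image, size: int = 8) -> int:
    """Feature: one bit per cell, set when the cell is brighter than the mean."""
    cells = [value for row in downsample(image, size) for value in row]
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def best_match(query: Image, references: Dict[str, int],
               max_distance: int = 10) -> Optional[str]:
    """Name of the closest known image, or None if nothing is close enough."""
    query_hash = average_hash(query)
    name, distance = min(
        ((n, hamming(query_hash, h)) for n, h in references.items()),
        key=lambda pair: pair[1],
    )
    return name if distance <= max_distance else None


if __name__ == "__main__":
    # Two synthetic "known" images: a horizontal and a vertical gradient.
    h_gradient = [[x * 16 for x in range(16)] for _ in range(16)]
    v_gradient = [[y * 16 for _ in range(16)] for y in range(16)]
    known = {"h_gradient": average_hash(h_gradient),
             "v_gradient": average_hash(v_gradient)}

    # A brightened copy of the first image should still match it.
    brighter = [[min(255, p + 10) for p in row] for row in h_gradient]
    print(best_match(brighter, known))
```

The known hashes are computed once, so matching a camera frame against 20 images costs 20 integer comparisons. The threshold trades false matches against misses and would need tuning on real captures.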

Simply put, without a model that is based on your images, you can't do much.

Upvotes: 1
