Creating custom object detection models

I am testing ImageAI object detection models such as RetinaNet and YOLOv3 on image datasets. The problem is that these models only support the 80 object types listed below:

person, bicycle, car, motorcycle, airplane, bus, train, truck, boat, traffic light, fire hydrant, stop sign,
parking meter, bench, bird, cat, dog, horse, sheep, cow, elephant, bear, zebra,
giraffe, backpack, umbrella, handbag, tie, suitcase, frisbee, skis, snowboard,
sports ball, kite, baseball bat, baseball glove, skateboard, surfboard, tennis racket,
bottle, wine glass, cup, fork, knife, spoon, bowl, banana, apple, sandwich, orange,
broccoli, carrot, hot dog, pizza, donut, cake, chair, couch, potted plant, bed,
dining table, toilet, tv, laptop, mouse, remote, keyboard, cell phone, microwave, oven,
toaster, sink, refrigerator, book, clock, vase, scissors, teddy bear, hair dryer, toothbrush.

The objects in my dataset (transformers) are not among the supported objects above. What is the best way to create a custom object detection model?
If I need to create my own dataset, how many images are enough to get good accuracy?

Upvotes: 2

Answers (2)

Matt Hill

Reputation: 1106

A lot of people want to do custom object detection. The answer from Nandu Raj is a great resource if you want to work directly with your own GPU hardware, install TensorFlow, and manage everything yourself.

However, if you want a service, I would suggest the product I worked on, IBM Watson Visual Recognition. It's a cloud-based service that requires no machine learning expertise or hardware. You use Watson Studio to train your own private model by drawing boxes around the objects you care about in roughly 50 images and then clicking the train button. The number of examples needed will vary depending on the types of objects you want to find, of course.

You can get started very quickly, though: label, say, 20 images, then click train. Watson will do its best and train a model on that in 15-20 minutes. Then you can use the auto-label feature to have that preliminary model suggest boxes for unlabeled images. Your task is then easier: you just correct any boxes that don't make sense and click re-train.

Here's a demo based on Lego people: https://medium.com/@vincent.perrin/watson-visual-recognition-object-detection-in-action-in-5-minutes-8f97c4b613c3 Don't miss the GitHub link to the example data if you want to give it a spin.

It's free to use with up to 1000 images per month - sign up for the "Lite" plan:

Video demo of the GUI with Studio: https://www.youtube.com/watch?v=eW6_PCYFq-Y If you prefer using curl or a Python SDK, start here: https://cloud.ibm.com/docs/visual-recognition?topic=visual-recognition-getting-started-tutorial

Upvotes: 1

Nandu Raj

Reputation: 2110

Follow the steps mentioned here:

https://github.com/EdjeElectronics/TensorFlow-Object-Detection-API-Tutorial-Train-Multiple-Objects-Windows-10

This will be a good start.
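
To give a rough idea of what the labeled data looks like before training: as far as I recall, that tutorial has you draw boxes with LabelImg, which saves one Pascal VOC style XML file per image, and then flattens those files into CSV rows. Below is a minimal sketch of that flattening step using only the Python standard library. The sample annotation, the file name `substation_001.jpg`, the box coordinates, and the helper name `parse_voc` are all made up for illustration and are not taken from the tutorial.

```python
import xml.etree.ElementTree as ET

# Hypothetical Pascal VOC style annotation for a single image.
# File name and coordinates are invented for this example.
SAMPLE_XML = """<annotation>
  <filename>substation_001.jpg</filename>
  <size><width>1280</width><height>720</height><depth>3</depth></size>
  <object>
    <name>transformer</name>
    <bndbox><xmin>412</xmin><ymin>180</ymin><xmax>655</xmax><ymax>540</ymax></bndbox>
  </object>
  <object>
    <name>transformer</name>
    <bndbox><xmin>820</xmin><ymin>210</ymin><xmax>1010</xmax><ymax>560</ymax></bndbox>
  </object>
</annotation>"""


def parse_voc(xml_text):
    """Return one dict per labeled box in a Pascal VOC style annotation."""
    root = ET.fromstring(xml_text)
    filename = root.findtext("filename")
    width = int(root.findtext("size/width"))
    height = int(root.findtext("size/height"))

    rows = []
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        rows.append({
            "filename": filename,
            "width": width,
            "height": height,
            "label": obj.findtext("name"),
            "xmin": int(box.findtext("xmin")),
            "ymin": int(box.findtext("ymin")),
            "xmax": int(box.findtext("xmax")),
            "ymax": int(box.findtext("ymax")),
        })
    return rows


rows = parse_voc(SAMPLE_XML)
for row in rows:
    print(row)
```

Each row is one labeled box, so an image containing several transformers contributes several rows. In practice you would loop over every XML file in your annotation folder and write the rows out with the `csv` module.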

Upvotes: 2
