Yu Da Chi
Yu Da Chi

Reputation: 109

How to train your own(w/o YOLO etc.) object detector in tf/keras

I successfully trained multi-classificator model, that was really easy with simple class related folder structure and keras.preprocessing.image.ImageDataGenerator with flow_from_directory (no one-hot encoding by hand btw!) after i just compile fit and evaluate - extremely well done pipeline by Keras!

BUT! when i decided to make my own (not cats, not dogs, not you_named) object detector - this is became a nightmare...

TFRecord and tf.Example are just madness! but ok, i almost get it (my dataset is small, i have plenty of ram, but who cares, write f. boilerplate, so much meh...)

The main thing - i just can't find any docs/tutorial how to make it with plain simple tf/keras, everyone just want to build up it on top of someone model, YOLO SSD FRCNN, even if they trying to detect completely new objects!!!

There two links about OD in official docs, and they both using some models underneath.

So my main question WHY ??? or i just blind..? -__-

Upvotes: 1

Views: 1256

Answers (2)

Jesus Gazol
Jesus Gazol

Reputation: 26

I am having pretty much the same problem as my images are B/W diagrams, quite different from regular pictures, I want to train a custom model on just only diagrams.

I have found this documentation section in Tensorflow models repo: https://github.com/tensorflow/models/blob/master/research/object_detection/README.md

It has a couple of sections explaining how to bring your own model and dataset in "extras" that could be a starting point.

Upvotes: 0

Frederik Bode
Frederik Bode

Reputation: 2744

It becomes a nightmare because Object Detection is way way harder than classification. The most simple object detector is this: first train a classifier on all your objects. Then when you want to detect objects in your image, slide a window over your image, and classify each window. Then, if your classifier is certain that a certain window is one of the objects, mark it as a successful detection.

But this approach has a lot of problems, mainly it's way (like waaaay) too slow. So, researcher improved it and invented RCNNs. That had it problems, so they invented Faster-RCNN, YOLO and SSD, all to make it faster and more accurate. You won't find any tutorials online on how to implement the sliding window technique because it's not useful anyway, and you won't find any tutorials on how to implement the more advanced stuff because, well, the networks get complicated pretty quick.

Also note that using YOLO doesn't mean you should use the same weights as in YOLO. You can always train YOLO from scratch on your own data if you want by randomly initiliazing all the weights in the network layers. So the even if they trying to detect completely new objects!!! you mentioned isn't really valid. Also also note that I still would advise you to do use the weights they used in Yolo network. Transfer Learning is generally looked at as being a good idea, especially when starting out and especially in the image processing world, as many images share common features (like edges, for example).

Upvotes: 1

Related Questions