Reputation: 21
I am using the TensorFlow Object Detection API (https://github.com/tensorflow/models/tree/master/research/object_detection) to train a CNN with the Single Shot MultiBox Detector (SSD) architecture and then detect objects in images and videos. Is there any way to implement a heatmap in the network to increase the accuracy of the model? If not, can you suggest another way to improve the model?
Thanks in advance
Upvotes: 2
Views: 2434
Reputation: 111
The answer depends on what you really want to get out of this work. If you just want a model that works well without understanding what happens under the hood, I would suggest trying a high-level service such as AutoML on Google Cloud Platform.
If instead you're interested in the technology behind it, you should first read the original SSD paper to get a deep understanding of how it works. After that, you can try adjusting the parameters you find in the .config file.
I would personally start by changing the feature extractor to a more accurate (but probably slower) one, like VGG16 or ResNet, instead of MobileNet (fast but less accurate).
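As a rough illustration of where that setting lives, here is a minimal sketch that swaps the `type` field of the `feature_extractor` block in a pipeline .config. The config excerpt is a made-up sample, `set_feature_extractor` is a hypothetical helper, and `ssd_resnet50_v1_fpn` is only an example value; check which extractor types your version of the API actually registers before using one.

```python
import re

# Hypothetical excerpt of a pipeline .config (protobuf text format).
SAMPLE_CONFIG = """
model {
  ssd {
    num_classes: 3
    feature_extractor {
      type: 'ssd_mobilenet_v1'
      depth_multiplier: 1.0
    }
  }
}
"""


def set_feature_extractor(config_text, new_type):
    """Replace the `type` field of the feature_extractor block.

    Assumes `type` is the first field inside the block, as in the sample above.
    """
    pattern = r"(feature_extractor\s*\{\s*type:\s*)(['\"])[^'\"]+\2"
    new_text, count = re.subn(
        pattern,
        lambda m: f"{m.group(1)}'{new_type}'",
        config_text,
        count=1,
    )
    if count == 0:
        raise ValueError("no feature_extractor type field found")
    return new_text


updated = set_feature_extractor(SAMPLE_CONFIG, "ssd_resnet50_v1_fpn")
print(updated)
```

In practice a different feature extractor usually also needs a matching `fine_tune_checkpoint` and possibly other fields, so editing the file by hand from one of the sample configs shipped with the API is often simpler.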
Then you can try changing the size and shape of the anchors, which are also set in the .config file.
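To get a feeling for what the anchor parameters do, here is a small sketch of the scale formula from the SSD paper: scales are spaced linearly between a minimum and a maximum, and each aspect ratio stretches the box while keeping its area. The defaults below (`num_layers=6`, `min_scale=0.2`, `max_scale=0.95`) are assumed values of the kind found in sample configs, and the function names are my own. The API's anchor generator has extra options that this sketch does not model.

```python
import math


def ssd_scales(num_layers=6, min_scale=0.2, max_scale=0.95):
    """Linearly spaced anchor scales, one per feature map (SSD paper formula)."""
    if num_layers == 1:
        return [min_scale]
    step = (max_scale - min_scale) / (num_layers - 1)
    return [min_scale + step * k for k in range(num_layers)]


def anchor_shapes(scale, aspect_ratios=(1.0, 2.0, 0.5)):
    """Return (width, height) pairs relative to the image size.

    For aspect ratio ar: width = scale * sqrt(ar), height = scale / sqrt(ar),
    so the box area stays scale**2 for every ratio.
    """
    return [(scale * math.sqrt(ar), scale / math.sqrt(ar)) for ar in aspect_ratios]


scales = ssd_scales()
for k, s in enumerate(scales):
    shapes = ", ".join(f"{w:.2f}x{h:.2f}" for w, h in anchor_shapes(s))
    print(f"layer {k}: scale={s:.2f} anchors (w x h): {shapes}")
```

If your objects are much smaller than 20% of the image side, or mostly very wide or very tall, printing these shapes makes it easier to see whether any anchor can match them at all before you start a long training run.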
However, I really recommend against a pure trial-and-error approach, because there's a high probability you'll end up losing a huge amount of time waiting for your training runs to finish. There are some simple but useful techniques to avoid this. For example, I would suggest starting with the most powerful configuration you can run, without regularization, to see whether a long training run can produce a model that at least overfits your dataset and works well on that data. If it can't, your network is not deep or large enough, and you should work on that first.
In my personal experience, I found preprocessing the images less useful than I expected, particularly when using transfer learning.
Upvotes: 2