Reputation: 55
I am trying to detect small objects in IP camera video streams using SSD MobileNetV2. The model was trained on high-resolution images of these objects, downloaded from the internet, in which the objects are very close to the camera. I found that changing the anchor box scales and modifying the feature extractor (.py) file are the proposed solutions to overcome this. Can anyone guide me on how to do this?
Upvotes: 5
Views: 7405
Reputation: 432
Late to the party, posting for posterity. I had better luck with small objects using the ssd_mobilenet_v2_fpnlite... variants.
Read about FPN here: https://towardsdatascience.com/review-fpn-feature-pyramid-network-object-detection-262fc7482610
Upvotes: 0
Reputation: 426
mobilenet-ssd is great for large objects, but its performance on small objects is pretty poor. It is always better to train with anchors tuned to the aspect ratios and sizes of the objects you expect. One more thing to take into account: the first branch is the one that detects the smallest objects, and its resolution is 1/16 of the input. You should consider adding another branch at the 1/8 feature map, which will help with small objects.
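To make the 1/16 versus 1/8 point concrete, here is a minimal sketch of how coarse each grid is. It assumes a 300x300 input (which I believe is what the ssd_mobilenet_v2_coco sample config uses, but treat it as an assumption) and a hypothetical 20-pixel object. `feature_map_size` and `cells_spanned` are illustrative helper names, not part of the Object Detection API.

```python
import math


def feature_map_size(input_size: int, stride: int) -> int:
    """Side length of the feature map at a given downsampling stride.

    Assumes SAME-style padding, so the size is rounded up.
    """
    return math.ceil(input_size / stride)


def cells_spanned(object_px: float, stride: int) -> float:
    """Number of grid cells an object of object_px pixels spans along one axis."""
    return object_px / stride


input_size = 300  # assumed input resolution
object_px = 20    # hypothetical small object, in input pixels

for stride in (16, 8):
    size = feature_map_size(input_size, stride)
    span = cells_spanned(object_px, stride)
    print(f"stride 1/{stride}: {size}x{size} grid, object spans {span:.2f} cells")
```

At stride 16 the hypothetical object barely covers one cell, while at stride 8 it covers a few, which is the intuition behind adding an earlier branch.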
How to change anchor sizes and aspect ratios
Take for example the pipeline.config file used for the training configuration: https://github.com/tensorflow/models/blob/master/research/object_detection/samples/configs/ssd_mobilenet_v2_coco.config. You will find the following arguments there:
anchor_generator {
  ssd_anchor_generator {
    num_layers: 6
    min_scale: 0.20000000298
    max_scale: 0.949999988079
    aspect_ratios: 1.0
    aspect_ratios: 2.0
    aspect_ratios: 0.5
    aspect_ratios: 3.0
    aspect_ratios: 0.333299994469
  }
}
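The anchor scale for each branch is computed as follows, where #branch is the zero-based index of the branch:
min_scale + (max_scale - min_scale)/(num_layers - 1) * (#branch)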
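As a rough worked example of the scale rule, the sketch below plugs in the rounded values from the sample config and converts the result to pixels. The 300x300 input size is an assumption, `anchor_scales` and `anchor_size_px` are hypothetical helpers rather than Object Detection API functions, and the width/height rule (scale times or divided by the square root of the aspect ratio) follows the SSD paper. The real generator may adjust the lowest layer, so the numbers are approximate.

```python
import math


def anchor_scales(min_scale: float, max_scale: float, num_layers: int) -> list:
    """Linearly spaced anchor scales, one per branch (branch index starts at 0)."""
    step = (max_scale - min_scale) / (num_layers - 1)
    return [min_scale + step * branch for branch in range(num_layers)]


def anchor_size_px(scale: float, aspect_ratio: float, input_size: int) -> tuple:
    """Anchor (width, height) in pixels for a given scale and aspect ratio."""
    width = scale * math.sqrt(aspect_ratio) * input_size
    height = scale / math.sqrt(aspect_ratio) * input_size
    return width, height


# Rounded values from the sample config; input size is an assumption.
scales = anchor_scales(min_scale=0.2, max_scale=0.95, num_layers=6)
for branch, scale in enumerate(scales):
    width, height = anchor_size_px(scale, aspect_ratio=1.0, input_size=300)
    print(f"branch {branch}: scale {scale:.2f}, square anchor {width:.0f}x{height:.0f} px")
```

With these values the smallest square anchor is about 60 pixels on a 300x300 input, which suggests why lowering min_scale is one of the suggested fixes when the objects are much smaller than that.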
(This is the same scale rule as defined in SSD: Single Shot MultiBox Detector - https://arxiv.org/pdf/1512.02325.pdf)
How to start the branches earlier
This would also need to be changed inside the code. Each predefined model has its own model file, e.g. for ssd_mobilenet_v2: https://github.com/tensorflow/models/blob/master/research/object_detection/models/ssd_mobilenet_v2_feature_extractor.py
Lines 111-117:
feature_map_layout = {
    'from_layer': ['layer_15/expansion_output', 'layer_19', '', '', '', ''
                  ][:self._num_layers],
    'layer_depth': [-1, -1, 512, 256, 256, 128][:self._num_layers],
    'use_depthwise': self._use_depthwise,
    'use_explicit_padding': self._use_explicit_padding,
}
You can choose what layers to start from by their name.
Now for my 2 cents: I didn't try mobilenet-v2-ssd and mainly used mobilenet-v1-ssd, but from my experience it is not a good model for small objects. I guess it can be optimized a little by editing the anchors, but I am not sure that will be sufficient for your needs. For a one-stage SSD-like network, consider using ssd_mobilenet_v1_fpn_coco: it works on a 640x640 input size, and its first branch starts at 1/8 of the input size. (Cons: bigger model and higher inference time.)
Upvotes: 12