Reputation: 4926
I have images of around 2000 x 2000 pixels. The objects I am trying to identify are much smaller (typically around 100 x 100 pixels), but there are a lot of them.
I don't want to resize the input images, apply object detection, and then rescale the output back to the original size. The reason is that I have very few images to work with, and I would prefer cropping (which gives multiple training instances per image) over resizing to a smaller size (which gives only one input image per original image).
Is there a sophisticated way of cropping and reassembling images for object detection, especially at the time of inference on test images?
For training, I suppose I would just take random crops and use those for training. But for testing, I want to know if there is a specific way of cropping the test image, applying object detection, and combining the results back into an output for the original large image.
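As a rough sketch of the training-side random cropping (the [x1, y1, x2, y2] box format and the min_visible cutoff for discarding mostly-cut-off boxes are assumptions, not anything fixed by the question):

```python
import numpy as np

def random_crop(image, boxes, crop=500, min_visible=0.5, rng=np.random):
    """Take one random crop and keep the boxes that remain mostly inside it.

    image: (H, W, C) array; boxes: (N, 4) array of [x1, y1, x2, y2].
    min_visible: fraction of a box's area that must survive the crop.
    """
    h, w = image.shape[:2]
    y0 = rng.randint(0, h - crop + 1)
    x0 = rng.randint(0, w - crop + 1)
    patch = image[y0:y0 + crop, x0:x0 + crop]

    # Clip boxes to the crop window and shift them into crop coordinates.
    clipped = boxes.astype(float).copy()
    clipped[:, [0, 2]] = np.clip(clipped[:, [0, 2]], x0, x0 + crop) - x0
    clipped[:, [1, 3]] = np.clip(clipped[:, [1, 3]], y0, y0 + crop) - y0

    # Drop boxes that lost too much of their area to the crop boundary.
    orig_area = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    new_area = (clipped[:, 2] - clipped[:, 0]) * (clipped[:, 3] - clipped[:, 1])
    keep = new_area >= min_visible * orig_area
    return patch, clipped[keep]
```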
Upvotes: 1
Views: 1279
Reputation: 1
I guess running the detector on several crops simultaneously is an option (I have never tried it): split the 2000 x 2000 image into a 4 x 4 grid of overlapping (500+50) x (500+50) tiles, run detection on each tile, then reassemble the detections at the output stage, probably with NMS at the tile borders since you mentioned the targets are dense.
It is a somewhat clumsy setup, though.
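For what it's worth, here is a minimal sketch of that tile-and-merge inference in NumPy. `detect_fn` is a placeholder for whatever detector you trained on the crops, and the [x1, y1, x2, y2] box format with per-box scores is an assumption on my part:

```python
import numpy as np

def tile_coords(img_size=2000, tile=550, overlap=50):
    """Top-left offsets of overlapping tiles; defaults give a 4 x 4 grid on 2000 px."""
    step = tile - overlap
    starts = list(range(0, img_size - tile + 1, step))
    if starts[-1] != img_size - tile:  # make sure the last tile reaches the image edge
        starts.append(img_size - tile)
    return starts

def nms(boxes, scores, iou_thr=0.5):
    """Plain NMS on [x1, y1, x2, y2] boxes; returns indices of kept boxes."""
    order = scores.argsort()[::-1]
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        xx1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
        yy1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
        xx2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
        yy2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
        inter = np.maximum(0, xx2 - xx1) * np.maximum(0, yy2 - yy1)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        areas = (boxes[order[1:], 2] - boxes[order[1:], 0]) * \
                (boxes[order[1:], 3] - boxes[order[1:], 1])
        iou = inter / (area_i + areas - inter)
        order = order[1:][iou <= iou_thr]
    return keep

def detect_tiled(image, detect_fn, tile=550, overlap=50):
    """Run detect_fn per tile, shift boxes to global coordinates, merge with NMS."""
    all_boxes, all_scores = [], []
    for y in tile_coords(image.shape[0], tile, overlap):
        for x in tile_coords(image.shape[1], tile, overlap):
            crop = image[y:y + tile, x:x + tile]
            boxes, scores = detect_fn(crop)  # boxes: (N, 4) in tile coordinates
            if len(boxes):
                all_boxes.append(boxes + np.array([x, y, x, y]))
                all_scores.append(scores)
    if not all_boxes:
        return np.empty((0, 4)), np.empty(0)
    boxes = np.concatenate(all_boxes)
    scores = np.concatenate(all_scores)
    keep = nms(boxes, scores)
    return boxes[keep], scores[keep]
```

The overlap matters because an object sitting on a tile border gets detected (possibly twice) in the two tiles that both contain it, and the global NMS then collapses the duplicates.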
One useful insight for detection on high-resolution images is to alter the backbone with a "U"-shaped shortcut structure, which solves some of these problems without resizing the images. Refer to U-Net.
Upvotes: 0