Kenji Theodore Kusuma

Reputation: 33

Yolo Training: multiple objects in one image

I have a set of training images that contain many small objects (10-20). The image resolution is high (9000x6000).

Is it better to split the image into the specific objects before running YOLO training, or to just leave it as is?

Does yolo resize an entire image, or does it ‘extract’ the annotated object first before resizing?

If it is the former, I am concerned that the resolution will be bad. Imagine 20 objects in a 416x416 image.

Upvotes: 3

Views: 3469

Answers (1)

Venkatesh Wadawadagi

Reputation: 2943

Does yolo resize an entire image, or does it ‘extract’ the annotated object first before resizing?

YOLO resizes the entire image to the network input size. It does not extract the annotated objects before resizing.
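To get a feel for what that resize does to an object, here is a minimal sketch. It assumes the image is simply stretched to the network size (no letterboxing or aspect-ratio preservation); the function name `size_after_resize` and the 400 x 400 example object are made up for illustration.

```python
def size_after_resize(obj_w, obj_h, img_w, img_h, net_w=608, net_h=608):
    """Object size in pixels after the whole image is stretched to
    net_w x net_h. Assumes a plain resize without letterboxing."""
    return obj_w * net_w / img_w, obj_h * net_h / img_h


# Hypothetical 400 x 400 px object in a 9000 x 6000 image, 608 x 608 network.
w, h = size_after_resize(400, 400, 9000, 6000)
print(round(w, 1), round(h, 1))  # prints: 27.0 40.5
```

Under this assumption the object shrinks more along the 9000-pixel side than along the 6000-pixel side, so the longer image side is the one that limits how small an object can be.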

Since your input images have very high resolution, what you can do is:

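  • YOLO can handle objects of about 25 x 25 pixels effectively with a network input size of 608 x 608. So if the objects in your original images are larger than about 250 x 250, you can train on the images as they are (with a 608 x 608 network size). Even after the images are resized to the network size, the objects will still be larger than roughly 25 x 25, which should give you good accuracy. The threshold comes from the downscale factor: (6000/600) * 25 = 250. Note that this uses the 6000-pixel side; if the image is stretched to a square input, the factor along the 9000-pixel side is closer to 15.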

  • If the objects in your original images are smaller than about 200 x 200, split each input image into smaller tiles, for example tiles of 2250 x 1500. A 9000 x 6000 image then gives a 4 x 4 grid, i.e. 16 tiles. Train on these tiles as individual images; each tile might contain zero to many objects. A 2250 x 1500 tile is downscaled far less when resized to the network size, so small objects keep more of their pixels. You can process the tiles in a sliding-window fashion.

  • Whichever method you choose for training should also be used for inference.
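The tiling approach can be sketched roughly as follows. This is only an illustration, not darknet code: the helper names, the `(cls, xmin, ymin, xmax, ymax)` pixel-box input format, and the `min_visible` cut-off for objects split by a tile border are all assumptions. It produces YOLO-style label rows (class, centre x, centre y, width, height, normalized to the tile) and shows how a detection on a tile maps back to the full image at inference time.

```python
def tile_origins(img_w, img_h, tile_w, tile_h):
    """Top-left corners of a non-overlapping grid of tiles over the image."""
    return [(x, y)
            for y in range(0, img_h, tile_h)
            for x in range(0, img_w, tile_w)]


def labels_for_tile(boxes, x0, y0, tile_w, tile_h, min_visible=0.5):
    """Convert full-image pixel boxes (cls, xmin, ymin, xmax, ymax) into
    YOLO rows (cls, cx, cy, w, h) normalized to the tile at (x0, y0).
    Boxes are clipped to the tile; a box is dropped if less than
    min_visible of its area lies inside (hypothetical threshold)."""
    rows = []
    for cls, xmin, ymin, xmax, ymax in boxes:
        # Intersection of the box with the tile.
        ix0, iy0 = max(xmin, x0), max(ymin, y0)
        ix1, iy1 = min(xmax, x0 + tile_w), min(ymax, y0 + tile_h)
        if ix1 <= ix0 or iy1 <= iy0:
            continue  # box does not touch this tile
        visible = (ix1 - ix0) * (iy1 - iy0) / ((xmax - xmin) * (ymax - ymin))
        if visible < min_visible:
            continue
        rows.append((cls,
                     ((ix0 + ix1) / 2 - x0) / tile_w,
                     ((iy0 + iy1) / 2 - y0) / tile_h,
                     (ix1 - ix0) / tile_w,
                     (iy1 - iy0) / tile_h))
    return rows


def to_full_image(box, x0, y0):
    """Shift a pixel box (xmin, ymin, xmax, ymax) detected on a tile
    back into full-image coordinates."""
    xmin, ymin, xmax, ymax = box
    return (xmin + x0, ymin + y0, xmax + x0, ymax + y0)


origins = tile_origins(9000, 6000, 2250, 1500)   # 16 tiles for a 4 x 4 grid
boxes = [(0, 2300, 100, 2500, 300)]              # one made-up 200 x 200 object
rows = labels_for_tile(boxes, 2250, 0, 2250, 1500)
```

Cropping the actual pixels for each tile can be done with any image library. Overlapping tiles (a stride smaller than the tile size) are a common way to avoid cutting objects at tile borders, at the cost of duplicate detections that need to be merged.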

To train on objects of all sizes, use the following models: [Use this if you train on the original images as they are]

If all of the objects you want to detect are small, then for effective detection use YOLOv4 with the following changes: [Use this if you split the original image into tiles]

References:

  • See this relevant GitHub thread
  • darknet documentation

Upvotes: 2
