What should the input image size be for training a YOLOv3 CNN model?

I've implemented YOLOv3 from scratch and plan to fine-tune it from MS-COCO weights on a different dataset. The images in the dataset I've chosen are 720*1280.

When I go through the YOLOv3 paper, the first Conv2D layer has filter_size = 3 and stride = 1, and its output size is 256*256.

Can someone give me a walkthrough of how the YOLO training part works here?

Upvotes: 1

Answers (1)

Venkatesh Wadawadagi

Reputation: 2943

From the YOLOv3 paper:

If the best possible accuracy/mAP is what you want, use 608 x 608 as the input layer size in the config.
If you want fast inference at the cost of some accuracy, use 320 x 320.
If you want a balanced model, use 416 x 416.
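All three sizes are multiples of 32, because YOLOv3 predicts at three scales with strides 32, 16 and 8. As a rough sketch (the function name `yolo_grid_sizes` is made up for illustration and is not part of any library), you can check which grid sizes a given input size produces:

```python
# Hypothetical helper: grid sizes of the three YOLOv3 detection scales
# for a square input of side `input_size`.
YOLO_STRIDES = (32, 16, 8)  # downsampling factors of the three detection heads


def yolo_grid_sizes(input_size: int) -> list[int]:
    """Return the grid side length at each detection scale."""
    if input_size <= 0 or input_size % 32 != 0:
        raise ValueError("input size should be a positive multiple of 32")
    return [input_size // stride for stride in YOLO_STRIDES]


if __name__ == "__main__":
    for size in (320, 416, 608):
        print(size, yolo_grid_sizes(size))
```

A larger input gives finer grids, which usually helps with small objects but costs more compute per image.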

Note that the framework (Darknet, in the original implementation) automatically resizes your images to the network input size set in the config before they reach the first layer, so you do not need to convert your 1280 x 720 images to the input layer size yourself.
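Since you implemented the network from scratch, that resize has to happen somewhere in your own input pipeline. One common option is an aspect-preserving "letterbox" resize (scale the image to fit, then pad the rest); other pipelines simply stretch the image instead. The sketch below only computes the scale and padding, and `letterbox_params` is a hypothetical name, not a function from Darknet or any other library:

```python
# Hypothetical helper: scale and padding for fitting a src_w x src_h image
# into a square dst x dst network input while keeping the aspect ratio.
def letterbox_params(src_w: int, src_h: int, dst: int) -> dict:
    scale = min(dst / src_w, dst / src_h)   # limiting side decides the scale
    new_w = round(src_w * scale)            # resized image width
    new_h = round(src_h * scale)            # resized image height
    pad_x = (dst - new_w) // 2              # padding on the left side
    pad_y = (dst - new_h) // 2              # padding on the top side
    return {"scale": scale, "new_w": new_w, "new_h": new_h,
            "pad_x": pad_x, "pad_y": pad_y}


if __name__ == "__main__":
    # A 1280 x 720 image going into a 416 x 416 network input.
    print(letterbox_params(1280, 720, 416))
```

Whichever resize you choose, the ground-truth boxes need the same scale and offset applied so that they still line up with the resized image.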

I suggest reading the following:

To understand how YOLOv3 works, read this blog post.
For the basics, read the original site.
Learn how to train your custom object detector here.

Upvotes: 2
