user14017884

Reputation: 11

What should the input image size be for training a YOLOv3 model?

I've implemented YOLOv3 from scratch and plan to fine-tune it from MS-COCO weights on a different dataset. The images in my dataset are 720*1280.

In the YOLOv3 paper, the first Conv2D layer is listed with filter size 3 and stride 1, and its output size is 256*256.

Can someone walk me through how the YOLO training part works here?

Upvotes: 1

Views: 6787

Answers (1)

Venkatesh Wadawadagi

Reputation: 2943

From the YOLOv3 paper:

  • If you want the best possible accuracy/mAP, use 608 x 608 as the input layer size in the config.
  • If you want fast inference at the cost of some accuracy, use 320 x 320.
  • If you want a balance between the two, use 416 x 416.
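To see why these three sizes work, and why the paper's table shows a 256 x 256 output after the first 3 x 3, stride-1 convolution, here is a minimal sketch of the arithmetic. It assumes the standard YOLOv3 layout with detection heads at strides 32, 16 and 8, and a padding of 1 on the 3 x 3 convolution; the function names are hypothetical and not taken from any library.

```python
def conv_out(size, kernel, stride, pad):
    """Output size of a convolution along one spatial dimension."""
    return (size + 2 * pad - kernel) // stride + 1


def yolo_grids(input_size, strides=(32, 16, 8)):
    """Grid sizes of the three detection scales (assumed strides 32, 16, 8)."""
    if input_size % 32 != 0:
        # The network downsamples by 32 overall, so the input
        # side must be a multiple of 32 for the grids to be whole.
        raise ValueError("input size should be a multiple of 32")
    return [input_size // s for s in strides]


for size in (320, 416, 608):
    print(size, yolo_grids(size))

# A 3x3 convolution with stride 1 and padding 1 keeps the spatial size,
# so a 256 x 256 input gives a 256 x 256 output.
print(conv_out(256, kernel=3, stride=1, pad=1))
```

The 256 x 256 figure in the paper's table refers to the input size used for the backbone description, not a size you must train the detector at; any multiple of 32 fits the same arithmetic.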

Note that the framework (e.g. Darknet) automatically resizes your images to the network input size set in the config before they reach the first convolutional layer, so you do not need to resize your 1280 x 720 images yourself.
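As a rough illustration of what such a resize does to a 1280 x 720 image, the sketch below computes an aspect-preserving "letterbox" fit into a square input. This is an assumption about the resize strategy: some implementations pad like this, others simply stretch the image, so check the one you are using. The function name is hypothetical.

```python
def letterbox_geometry(src_w, src_h, dst=416):
    """Scaled size and padding needed to fit an image into a dst x dst square.

    Assumes an aspect-preserving resize followed by symmetric padding.
    """
    scale = min(dst / src_w, dst / src_h)
    new_w = round(src_w * scale)
    new_h = round(src_h * scale)
    pad_x = (dst - new_w) // 2  # padding on the left (and right)
    pad_y = (dst - new_h) // 2  # padding on the top (and bottom)
    return scale, new_w, new_h, pad_x, pad_y


scale, new_w, new_h, pad_x, pad_y = letterbox_geometry(1280, 720, dst=416)
print(new_w, new_h, pad_x, pad_y)
```

If you write your own data loader instead of relying on the framework, the bounding-box labels have to be transformed with the same scale and padding.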

I suggest reading the following:

  • To understand how YOLOv3 works, read this blog post.
  • For the basics, read the original site.
  • Learn how to train your custom object detector here.

Upvotes: 2
