Farahats9

Reputation: 555

Darknet YOLO image size

I am trying to train a custom object classifier in Darknet YOLO v2: https://pjreddie.com/darknet/yolo/

I gathered a dataset of images. Most of them are 6000 x 4000 px, and some have lower resolutions.

Do I need to resize the images to be square before training?

I found that the config uses:

[net]
batch=64
subdivisions=8
height=416
width=416
channels=3
momentum=0.9
decay=0.0005
angle=0
saturation = 1.5
exposure = 1.5
hue=.1

That's why I was wondering how to use it with data sets that contain images of different sizes.

Upvotes: 30

Views: 67992

Answers (5)

Nouman Ahsan

Reputation: 367

You don't need to resize the images in your dataset. PJReddie's YOLO implementation does it by itself, preserving the aspect ratio (nothing is cropped or distorted), according to the resolution set in the .cfg file. For example, if an image is 1248 x 936, YOLO will resize it to 416 x 312 and then pad the remaining space with black bars to fit the 416 x 416 network.
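The arithmetic behind that example can be sketched in a few lines of Python. This is only an illustration of aspect-preserving "letterbox" scaling, not Darknet's actual code; the function name `letterbox_dims` and the choice to center the image (equal bars on both sides) are assumptions made for the example.

```python
def letterbox_dims(img_w, img_h, net_w=416, net_h=416):
    """Return (new_w, new_h, pad_x, pad_y) for an aspect-preserving fit.

    Hypothetical helper: it only computes sizes, it does not touch pixels.
    """
    # Compare net_w / img_w with net_h / img_h without using floats:
    # the smaller ratio is the side that limits the scale factor.
    if net_w * img_h <= net_h * img_w:
        new_w = net_w
        new_h = (img_h * net_w) // img_w
    else:
        new_h = net_h
        new_w = (img_w * net_h) // img_h
    # Assumption: the leftover space is split evenly (centered image).
    pad_x = (net_w - new_w) // 2
    pad_y = (net_h - new_h) // 2
    return new_w, new_h, pad_x, pad_y


print(letterbox_dims(1248, 936))  # (416, 312, 0, 52)
```

For the 1248 x 936 image this gives a 416 x 312 picture with 52 px bars above and below, which matches the numbers in the answer.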

Upvotes: 13

kascesar

Reputation: 11

By default the Darknet API resizes the images in both inference and training, but in theory any input size w = h = 32 x X should work, where X is a natural number, w is the width and h is the height. By default X = 13, so the input size is (w, h) = (416, 416). I use this rule with YOLOv3 in OpenCV, and it works better the bigger X is.
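A small sketch of that rule, assuming the factor of 32 stated above. The helper names (`is_valid_size`, `round_down_to_stride`, `grid_cells`) are made up for illustration and are not part of Darknet or OpenCV.

```python
STRIDE = 32  # factor taken from the rule above


def is_valid_size(side, stride=STRIDE):
    """True if `side` is a positive multiple of the stride."""
    return side > 0 and side % stride == 0


def round_down_to_stride(side, stride=STRIDE):
    """Largest multiple of the stride that is <= side (at least one stride)."""
    return max(stride, (side // stride) * stride)


def grid_cells(side, stride=STRIDE):
    """The X in `side = 32 x X`."""
    return side // stride


print(is_valid_size(416), grid_cells(416))  # True 13
print(round_down_to_stride(600))            # 576
```

So 416 corresponds to X = 13, and a size such as 600 would have to be adjusted to a neighbouring multiple (576 or 608) before being written into the .cfg file.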

Upvotes: 1

Trinath Reddy

Reputation: 21

You do not need to resize the images; you can change the values directly in the darknet.cfg file.

  1. When you open the darknet.cfg (yolo-darknet.cfg) file, you can see all the
    hyper-parameters and their values.
  2. As shown in your cfg file, the image dimensions are (416,416)->(width,height). You can change these values, and darknet will automatically resize the images before training.
  3. Since the images have high dimensions, you can adjust the batch and subdivisions values (lower the values, e.g. 32, 16, 8; they have to be multiples of 2) so that darknet will not crash with a memory allocation error.
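To see how these settings interact, here is a rough sketch. It assumes, as I understand Darknet's behaviour, that each batch is processed in `subdivisions` equal chunks, so `batch / subdivisions` images are in memory per pass. The function names are hypothetical, and the second function only counts input values; it ignores weights and activations, so treat it as a relative indicator rather than a memory estimate.

```python
def images_per_pass(batch, subdivisions):
    """Images processed at once, assuming batch is split into equal chunks."""
    if batch % subdivisions != 0:
        raise ValueError("batch should be divisible by subdivisions")
    return batch // subdivisions


def input_values_per_pass(batch, subdivisions, width, height, channels=3):
    """Number of input values per pass (input tensor only, not total memory)."""
    return images_per_pass(batch, subdivisions) * width * height * channels


# Values from the question's cfg: batch=64, subdivisions=8, 416x416x3.
print(images_per_pass(64, 8))                  # 8
print(input_values_per_pass(64, 8, 416, 416))  # 4153344
```

Under that assumption, a larger width and height increases the per-pass load, and it can be offset by lowering `batch` or by splitting the batch into more subdivisions.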

Upvotes: 2

Nerxis

Reputation: 3917

You don't have to resize them, because Darknet will do it for you!

That means you really don't need to do it yourself, and you can use images of different sizes during training. What you posted above is just the network configuration; the file should also contain the full network definition. The height and width give the network resolution. Darknet also keeps the aspect ratio, check e.g this.

Upvotes: 37

David Parks

Reputation: 32051

It is very common to resize images before training. 416x416 is slightly larger than usual; most ImageNet models, for example, resize and square the images to 256x256, so I would expect the same here. Training on 6000x4000 images directly would require a farm of GPUs. The standard process is to square the image to its largest dimension (height or width) by padding the shorter side with 0's, then resize it using a standard image library such as PIL.

Upvotes: 10
