user10024395

Reputation: 1

Is there any particular reason why people pick a 224x224 image size for ImageNet experiments?

Does 224x224 give better accuracy for some reason, or is it just a computational constraint? I would expect a bigger image to give better accuracy, no?

Upvotes: 23

Views: 18184

Answers (1)

Lucas Ramos

Reputation: 453

Well, bigger images contain more information, which may or may not be relevant. The size of your input matters because the bigger the input, the more computation and memory each pass needs, and, if the network has fully connected layers after the convolutions, the more parameters it has to handle. More parameters may lead to several problems. First, you'll need more computing power. Then you may need more data to train on, since a lot of parameters and not enough samples may lead to overfitting, especially with CNNs. The choice of 224 in AlexNet also allowed them to apply some data augmentation: images were downscaled to 256x256, and random 224x224 crops (and their horizontal flips) were used for training.
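To put rough numbers on the cost argument, here is a minimal sketch in plain Python. The layer shapes are made up for illustration (a 3x3 convolution from 3 to 64 channels, and a 1000-unit fully connected layer sitting on a 512-channel feature map that I assume is downsampled by a factor of 32); they are not taken from any particular network.

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Spatial size of a convolution output for a square input."""
    return (size + 2 * padding - kernel) // stride + 1


def conv_layer_cost(in_size, in_ch, out_ch, kernel, stride=1, padding=0):
    """Return (output size, parameter count, multiply-accumulates) for one conv layer."""
    out_size = conv_output_size(in_size, kernel, stride, padding)
    # Weights plus one bias per output channel; independent of the input size.
    params = out_ch * (in_ch * kernel * kernel + 1)
    # One kernel-sized dot product per output pixel and output channel.
    macs = out_size * out_size * out_ch * in_ch * kernel * kernel
    return out_size, params, macs


def fc_params(feature_size, channels, units):
    """Parameters of a fully connected layer fed by a flattened square feature map."""
    return (feature_size * feature_size * channels + 1) * units


for side in (224, 448):
    _, params, macs = conv_layer_cost(side, in_ch=3, out_ch=64, kernel=3, padding=1)
    # Hypothetical assumption: the network downsamples by 32 before the classifier.
    fc = fc_params(side // 32, channels=512, units=1000)
    print(f"{side}x{side}: conv params={params}, conv MACs={macs}, fc params={fc}")
```

Under these assumptions, doubling the side length leaves the convolution's parameter count unchanged but multiplies its multiply-accumulates by four, and the fully connected layer's parameter count grows by roughly four as well.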

For instance, if you have a 512x512 image and you want to recognize an object in it, it would be better to resample it to 256x256, take smaller patches of 224x224 or 200x200, do some data augmentation, and then train. You could also use patches of 400x400 with data augmentation and train, provided that you have enough data.
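The patch-based augmentation described above could look roughly like this. It is only a sketch: the image is a plain list of pixel rows that I assume has already been resampled to 256x256 (the resampling step itself is left out), and the function names are my own, not from any library.

```python
import random


def random_crop(image, crop_size, rng):
    """Cut a random crop_size x crop_size patch out of an image given as a list of rows."""
    height, width = len(image), len(image[0])
    if crop_size > height or crop_size > width:
        raise ValueError("crop is larger than the image")
    top = rng.randint(0, height - crop_size)   # randint is inclusive on both ends
    left = rng.randint(0, width - crop_size)
    return [row[left:left + crop_size] for row in image[top:top + crop_size]]


def random_horizontal_flip(image, rng, p=0.5):
    """Mirror the image left to right with probability p."""
    if rng.random() < p:
        return [row[::-1] for row in image]
    return image


def augment(image, crop_size=224, rng=None):
    """Random crop followed by a random horizontal flip."""
    rng = rng or random.Random()
    return random_horizontal_flip(random_crop(image, crop_size, rng), rng)


# Dummy 256x256 "image" whose pixels record their own (row, column) position.
image = [[(r, c) for c in range(256)] for r in range(256)]
patch = augment(image, crop_size=224, rng=random.Random(0))
print(len(patch), len(patch[0]))
```

With a 256x256 source and 224x224 patches there are (256 - 224 + 1)^2 = 1089 possible crop positions, doubled by the flip, so each training image can be shown to the network in many slightly different versions.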

Don't forget to do cross-validation so you can check if there's overfitting.

Upvotes: 14
