How does the size of the patch/kernel impact the result of a convnet?

I am playing around with convolutional neural networks at home with TensorFlow (by the way, I have done the Udacity deep learning course, so I have the theoretical basis). What impact does the patch size have when one runs a convolution? Does that size have to change when the image is bigger or smaller?

One of the exercises I did involved the CIFAR-10 database of images (32x32 px); I used 3x3 convolutions (with a padding of 1) and got decent results.

But let's say I now want to play with images larger than that (say 100x100): should I make my patches bigger, or do I keep them 3x3? Furthermore, what would be the impact of making a patch really big (say 50x50)?

Normally I would test this at home directly, but running this on my computer is a bit slow (no NVIDIA GPU!).

So the question should be summarized as

  1. Should I increase/decrease the size of my patches when my input images are bigger/smaller?
  2. What is the impact (in terms of performance/overfitting) of increasing/decreasing my patch size?

Upvotes: 3

Views: 7735

Answers (2)

Kareem Jano

Reputation: 11

It depends more on the size of the objects you want to detect, or in other words, on the size of the receptive field you want. That said, choosing the kernel size has always been a difficult decision. That is why the Inception model was created, which uses several kernel sizes (1x1, 3x3, 5x5). Its creators also went further and decomposed convolutional layers into ones with smaller kernels while keeping the same receptive field, to try to speed up training (e.g. a 5x5 was decomposed into two stacked 3x3 convolutions, and a 3x3 into a 3x1 followed by a 1x3), producing the different versions of the Inception model.
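To make the decomposition idea concrete, here is a minimal sketch in plain Python. It assumes stride-1 convolutions applied one after another, counts weights for a single input channel and a single output channel, and ignores biases. The helper names `receptive_field` and `weight_count` are made up for this illustration and are not part of TensorFlow or the Inception code.

```python
def receptive_field(kernel_shapes):
    """Receptive field (height, width) of stride-1 convolutions stacked in order."""
    h = w = 1
    for kh, kw in kernel_shapes:
        # Each stride-1 layer widens the field by (kernel size - 1) per axis.
        h += kh - 1
        w += kw - 1
    return (h, w)


def weight_count(kernel_shapes):
    """Weights per (input channel, output channel) pair, biases ignored."""
    return sum(kh * kw for kh, kw in kernel_shapes)


single_5x5 = [(5, 5)]
two_3x3 = [(3, 3), (3, 3)]
single_3x3 = [(3, 3)]
asymmetric = [(3, 1), (1, 3)]

for name, shapes in [("5x5", single_5x5), ("3x3 + 3x3", two_3x3),
                     ("3x3", single_3x3), ("3x1 + 1x3", asymmetric)]:
    print(name, "field:", receptive_field(shapes), "weights:", weight_count(shapes))
```

Under these assumptions the stacked versions cover the same area as the single larger kernel with fewer weights (18 instead of 25, and 6 instead of 9), which is where the savings come from.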

You can also check the Inception V2 paper for more details https://arxiv.org/abs/1512.00567

Upvotes: 0

malioboro

Reputation: 3291

If you are not using padding, a larger kernel makes the number of neurons in the next layer smaller.

Example: on an NxN input, a 1x1 kernel gives the next layer the same number of neurons, while an NxN kernel gives only one neuron in the next layer.
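The relationship in the example can be written down directly. This is a small sketch assuming a square input, a square kernel, no padding and a stride of 1 unless stated otherwise; `valid_output_size` is a hypothetical helper name, not a TensorFlow function.

```python
def valid_output_size(input_size, kernel_size, stride=1):
    """Output width/height of a convolution without padding."""
    if kernel_size > input_size:
        raise ValueError("kernel does not fit inside the input")
    return (input_size - kernel_size) // stride + 1


# CIFAR-10 sized input (32x32) with a few kernel sizes.
for k in (1, 3, 5, 32):
    out = valid_output_size(32, k)
    print(f"kernel {k}x{k}: output {out}x{out}")
```

With a 100x100 input and a 50x50 kernel, the same formula gives a 51x51 output.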

The impact of larger kernel:

  • The next layer is smaller, so the layers after it need less computation and memory (the convolution itself, however, has more weights per filter and can cost more to compute)
  • A lot of detail is lost. Imagine an NxN input and an NxN kernel: the next layer has only one neuron. Losing that much detail can lead to underfitting.
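To put rough numbers on the cost side, the sketch below counts output neurons, weights and multiply-adds for one filter. It assumes a single-channel square input, no padding, stride 1 and a direct (non-FFT) convolution; `conv_cost` is a hypothetical helper written for this illustration. Which quantity dominates depends on the sizes involved, so treat it as a back-of-the-envelope estimate.

```python
def conv_cost(input_size, kernel_size):
    """Rough cost of one filter on a single-channel square input (no padding, stride 1)."""
    out = input_size - kernel_size + 1
    outputs = out * out                    # neurons in the next layer
    weights = kernel_size * kernel_size    # parameters in the filter
    multiply_adds = outputs * weights      # one multiply-add per weight per output
    return {"outputs": outputs, "weights": weights, "multiply_adds": multiply_adds}


# The 100x100 image from the question with small, large and full-size kernels.
for k in (3, 50, 100):
    print(k, conv_cost(100, k))
```

The output shrinks as the kernel grows, while the weight count grows quadratically; the multiply-add count rises first and only falls again once the kernel is close to the image size.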

The answer:

  1. It depends on the images. If you need a lot of detail from the image, you don't need to increase your kernel size. If your image is a 1000x1000 pixel enlarged version of an MNIST image, I would increase the kernel size.
  2. A smaller kernel gives you a lot of detail, which can lead to overfitting, while a larger kernel loses a lot of detail, which can lead to underfitting. You should tune your model to find the best size. Sometimes time and machine specifications also have to be considered.

If you are using padding, you can adjust it so that the number of neurons after the convolution stays the same. I can't say it will be better than not using padding, but a larger kernel still loses more detail than a smaller one.
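As a sketch of how the padding can be adjusted, assuming stride 1, an odd kernel size and the same amount of zero padding on every side, the amount needed to keep the output the same size as the input is (k - 1) / 2. The helper names `same_padding` and `padded_output_size` are made up for this example.

```python
def same_padding(kernel_size):
    """Padding per side that keeps the output size equal to the input size (stride 1)."""
    if kernel_size % 2 == 0:
        raise ValueError("symmetric 'same' padding needs an odd kernel size")
    return (kernel_size - 1) // 2


def padded_output_size(input_size, kernel_size, padding):
    """Output width/height with symmetric zero padding and stride 1."""
    return input_size + 2 * padding - kernel_size + 1


for n, k in [(32, 3), (100, 5), (100, 51)]:
    p = same_padding(k)
    print(f"input {n}, kernel {k}: padding {p}, output {padded_output_size(n, k, p)}")
```

This matches the setup in the question: a 3x3 kernel with a padding of 1 keeps a 32x32 CIFAR-10 image at 32x32.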

Upvotes: 5
