Arnaud Hureaux

Reputation: 131

How do successive convolutional layers work?

If my first convolution has 64 filters and my second has 32 filters, will I have:

1 image -> Conv(64 filters) -> 64 filtered images -> Conv(32 filters) -> 64 x 32 = 2048 filtered images

Or:

1 image -> Conv(64 filters) -> 64 filtered images -> Conv(32 filters) -> 32 filtered images

If it is the second answer: what is going on between the 64 filtered images and the second Conv?

Thanks for your answer. I can't find a good tutorial that explains this clearly; they always rush through it...

Upvotes: 0

Views: 1209

Answers (2)

Leila Abdelrahman

Reputation: 208

Your first point is correct. Convolutions are essentially ways of altering and extracting features from data. We do this by creating m images, each looking at a certain frame of the original image. On top of this first convolutional layer, we then take n images for each convolved image from the first layer.

So m * n would be the total number of images.

To further this point, a convolution works by making feature maps of an image. When you have successive convolutional layers, you are making feature maps of feature maps. I.e. if I start with 1 image, and my first convolutional layer has 20 filters, then I have 20 images (more specifically, feature maps) at the end of convolution 1. Then let's say I add a second convolution with 10 filters. What happens then is that I am making 10 feature maps for every 1 image. Thus, it would be 20 * 10 = 200 feature maps.

Let's say, for example, you have a 50x50 pixel image and a convolutional layer with a filter of size 5x5. What happens (if you don't have padding or anything else, and the stride is 1) is that you "slide" the filter across the image and get a weighted sum of the pixels under it at each step of the slide (depending on your location). You would then get an output feature map of size 46x46, since a 5x5 window fits in 50 - 5 + 1 = 46 positions along each axis. Let's say you do this 20 times (i.e. a 5x5x20 convolution). You would then have as output 20 feature maps of size 46x46. In the diagram mentioned in the VGG neural network post below, the diagram only shows the number of feature maps to be made for the incoming feature maps, NOT the end sum of feature maps.
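The sliding-window step can be checked with a small pure-Python sketch. This is only an illustration under stated assumptions: stride 1, no padding, a single-channel image, and made-up values (an all-ones image and an averaging kernel); the function name `conv2d_valid` is hypothetical and not taken from any library.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` with stride 1 and no padding.

    Both arguments are lists of rows. Returns one feature map.
    """
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = ih - kh + 1, iw - kw + 1  # number of positions the window fits in
    out = [[0.0] * ow for _ in range(oh)]
    for r in range(oh):
        for c in range(ow):
            # Weighted sum of the pixels currently under the filter.
            out[r][c] = sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(kh)
                for j in range(kw)
            )
    return out


# Assumed example data: a 50x50 image of ones and a 5x5 averaging filter.
image = [[1.0] * 50 for _ in range(50)]
kernel = [[1.0 / 25] * 5 for _ in range(5)]

feature_map = conv2d_valid(image, kernel)
print(len(feature_map), len(feature_map[0]))  # 46 46

# Applying 20 filters to the same image gives 20 feature maps.
filters = [kernel for _ in range(20)]
feature_maps = [conv2d_valid(image, k) for k in filters]
print(len(feature_maps))  # 20
```

With these assumptions, a 5x5 filter on a 50x50 image fits in 50 - 5 + 1 = 46 positions along each axis, so each of the 20 feature maps is 46x46.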

I hope this explanation was thorough!

Upvotes: 2

Arnaud Hureaux

Reputation: 131

Here we have the architecture of VGG-16.

In VGG-16 the convolutional layers come in four widths: 64, 128, 256 and 512 filters. And in the architecture we see that we don't have 64 images, then 64*128 images, etc., but just 64 images, 128 images, etc.

So the correct answer was not the first but the second. And that leads to my second question:

"What is going on between the 64 filtered images and the second Conv?"

I think that between a 64-filter conv and a 32-filter conv there is in the end only 1 filter, but applied across two pixel layers, so it divides the thickness of the conv by 2.

And between a 64-filter conv and a 128-filter conv there are only 2 filters on one pixel layer, so it multiplies the thickness of the conv by 2.

Am I right?
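One way to check this against the usual definition of a convolutional layer is a small pure-Python sketch. It assumes the standard formulation, in which each filter of the second layer has one 2D kernel per incoming feature map and sums the results over all incoming maps into a single output map. Bias, activation and pooling are left out, and the sizes (an 8x8 input, 3x3 kernels, random weights) and the names `conv_layer` and `random_filters` are purely illustrative.

```python
import random


def conv_layer(inputs, filters):
    """Apply a convolutional layer (stride 1, no padding, no bias).

    inputs:  list of C_in feature maps, each a list of rows.
    filters: list of C_out filters, each holding C_in kernels of size K x K.
    Returns a list of C_out feature maps.
    """
    c_in = len(inputs)
    h, w = len(inputs[0]), len(inputs[0][0])
    outputs = []
    for f in filters:
        assert len(f) == c_in  # every filter spans ALL input maps
        k = len(f[0])
        oh, ow = h - k + 1, w - k + 1
        out = [[0.0] * ow for _ in range(oh)]
        for r in range(oh):
            for c in range(ow):
                total = 0.0
                for ch in range(c_in):  # sum over the input channels
                    for i in range(k):
                        for j in range(k):
                            total += inputs[ch][r + i][c + j] * f[ch][i][j]
                out[r][c] = total
        outputs.append(out)  # one output map per filter
    return outputs


def random_filters(c_out, c_in, k):
    """Build c_out filters, each with c_in random k x k kernels."""
    return [
        [[[random.uniform(-1, 1) for _ in range(k)] for _ in range(k)]
         for _ in range(c_in)]
        for _ in range(c_out)
    ]


random.seed(0)
image = [[[random.random() for _ in range(8)] for _ in range(8)]]  # 1 map, 8x8

layer1 = random_filters(64, 1, 3)   # 64 filters, each 1 x 3 x 3
layer2 = random_filters(32, 64, 3)  # 32 filters, each 64 x 3 x 3

maps1 = conv_layer(image, layer1)
maps2 = conv_layer(maps1, layer2)
print(len(maps1), len(maps2))  # 64 32
```

Under these assumptions the second layer outputs 32 feature maps, not 64 x 32: each of its 32 filters has a depth of 64, reads all 64 incoming maps at once, and collapses them into one map by summation. The number of maps after a layer equals that layer's filter count, whether it is smaller (64 to 32) or larger (64 to 128) than the number of incoming maps.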

Upvotes: 0
