user9045471

Reputation: 19

How to understand that a CNN convolutional layer's output depth equals the number of convolution filters used?

Let's say the input to my conv layer is 256 x 256 x 64 and I use 32 filters of size 3 x 3. Why is the output depth 32 and not 64? How is convolution carried out along the depth axis?

Upvotes: 1

Views: 1902

Answers (2)

vikram meena

Reputation: 299

The output depth depends on the number of filters in the current convolution layer.

When you define a filter, you provide only its spatial dimensions (X, Y); the last dimension of the filter is set by the number of input channels.

In convolution, each filter (3, 3, z) is convolved with the input (Xin, Yin, z): at every spatial position the filter is multiplied element-wise with the patch under it, and the products are summed over all three axes, including depth. This produces an output of dimension (Xout, Yout, 1). Each filter produces a single output layer, which is why the output depth equals the number of filters in that convolution layer.
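To make the depth behaviour concrete, here is a minimal NumPy sketch of a no-padding, stride-1 convolution. The function name `conv2d_naive`, the small 8 x 8 x 4 input and the 5 random filters are illustrative assumptions, not anything from the question; real frameworks use far faster implementations.

```python
import numpy as np

def conv2d_naive(x, filters):
    """Valid (no padding), stride-1 convolution, written for clarity not speed.

    x:       (H, W, C_in) input volume
    filters: (N, k, k, C_in) bank of N filters, each spanning the full input depth
    returns: (H - k + 1, W - k + 1, N)
    """
    H, W, C_in = x.shape
    N, k, _, C_f = filters.shape
    assert C_f == C_in, "filter depth must match input depth"
    out = np.zeros((H - k + 1, W - k + 1, N))
    for n in range(N):                      # one output channel per filter
        for i in range(H - k + 1):
            for j in range(W - k + 1):
                patch = x[i:i + k, j:j + k, :]            # (k, k, C_in) patch
                out[i, j, n] = np.sum(patch * filters[n])  # sum over depth too
    return out

# Hypothetical small example: 4 input channels, 5 filters of size 3 x 3.
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8, 4))
filters = rng.standard_normal((5, 3, 3, 4))
y = conv2d_naive(x, filters)
print(y.shape)  # (6, 6, 5)
```

The input depth (4) disappears in the sum, and the output depth (5) is simply the number of filters.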

Note: If you are interested in the output size of the layer, you can calculate it using:

X_out = (X_in + 2*Pad - Kernel) / Stride + 1

This slide explains how convolution works.

Convolution Image
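As a quick check of the output-size formula, here is a small sketch. The helper name `conv_output_size` is hypothetical, and it assumes the usual convention of rounding the division down when the stride does not divide evenly.

```python
def conv_output_size(x_in, kernel, pad=0, stride=1):
    """Spatial output size of a convolution along one axis.

    Uses floor((x_in + 2*pad - kernel) / stride) + 1.
    """
    return (x_in + 2 * pad - kernel) // stride + 1

print(conv_output_size(256, 3))                   # 254 (no padding, stride 1)
print(conv_output_size(256, 3, pad=1))            # 256 (size preserved)
print(conv_output_size(256, 3, pad=1, stride=2))  # 128
```

Note that the formula only gives the width and height; the depth of the output is still the number of filters.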

You may want to watch the video of the lecture by Prof. Fei-Fei Li. Here is the complete deep learning course by her.

Upvotes: 1

John

Reputation: 1272

In a CNN, each filter is defined by its height and width (3 x 3). Its connectivity along the depth axis is always equal to the depth of the input.

Taking your example:

You have 32 filters, each of size (3x3). So a neuron in each filter looks at a (3x3x64) patch of the input, and the number of neurons along each spatial dimension of a filter's output layer (without zero padding and with stride=1) is (256-3+1) = 254.
So Neuron1 in each layer, i.e. Layer1, Layer2, ..., Layer32, looks at the same (3x3x64) patch of the input. This makes the output size of the convolutional layer (254x254x32).
This is how the number of filters determines the output depth of a convolutional layer.
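The arithmetic for the question's numbers can be spelled out in a few lines. This is only a sketch assuming no padding and stride 1; the variable names are made up for illustration.

```python
# Numbers from the question: 256 x 256 x 64 input, 32 filters of 3 x 3.
in_h, in_w, in_depth = 256, 256, 64
num_filters, k = 32, 3

patch_shape = (k, k, in_depth)           # what one neuron sees: (3, 3, 64)
weights_per_filter = k * k * in_depth    # 3 * 3 * 64 = 576 weights per filter

out_h = in_h - k + 1                     # 254 positions along the height
out_w = in_w - k + 1                     # 254 positions along the width
output_shape = (out_h, out_w, num_filters)  # (254, 254, 32)

total_weights = weights_per_filter * num_filters  # 576 * 32 = 18432

print(patch_shape, output_shape, weights_per_filter, total_weights)
```

The 64 only shows up inside each filter's weights; it never appears in the output shape.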
For a clearer picture of what I have discussed here, you can refer to this link. It explains CNNs far more intuitively and mathematically.

Upvotes: 1
