Reputation: 186
The function that is currently used widely in tutorials and elsewhere is of the form:
conv_out = conv2d(
    input=x,        # some 4D tensor
    filters=w,      # some shared variable
    filter_shape=[nkerns, stack_size, filter_height, filter_width],
    image_shape=[batch_size, stack_size, height, width]
)
For the first layer of a CNN, I have filter_shape as [20, 1, 7, 7], which means 20 kernels, each 7 x 7 — but what does the '1' stand for? My image_shape is [100, 1, 84, 84].
This convolution outputs a tensor of shape [100, 20, 26, 26], which I understand. My next layer takes filter_shape = [50, 20, 5, 5] and image_shape = [100, 20, 26, 26], and produces an output of shape [100, 50, 11, 11]. I mostly understand this operation, except: if I have a layer of 50 filters, each working on the 20 feature maps produced by the previous layer, shouldn't I get 1000 feature maps in all instead of just 50? To restate my question: with a stack of 20 feature maps, each convolved with 50 kernels, shouldn't my output shape be [100, 1000, 11, 11] instead of [100, 50, 11, 11]?
Upvotes: 2
Views: 1518
Reputation: 14377
To answer your questions:
The 1 stands for the number of input channels. Since you seem to be using grayscale images, it is 1; for color images it would be 3. For deeper convolutional layers, as in your second question, it must equal the number of feature maps the previous layer produced.
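To make the channel bookkeeping concrete, here is a sketch of the shapes from your question. (The 26x26 and 11x11 sizes suggest a pooling step after each convolution, since a 'valid' convolution alone would give 78x78 and 22x22 — that part is my assumption.)

```python
# Layer 1: grayscale input, so the channel dimension is 1 in both shapes.
image_shape_1 = (100, 1, 84, 84)    # (batch, channels, height, width)
filter_shape_1 = (20, 1, 7, 7)      # (n_kernels, channels, f_height, f_width)

# Layer 2: the channel dimension must match the 20 maps layer 1 produced.
image_shape_2 = (100, 20, 26, 26)
filter_shape_2 = (50, 20, 5, 5)

def valid_conv_shape(image_shape, filter_shape):
    """Output shape of a 'valid' convolution, before any pooling."""
    b, c, h, w = image_shape
    n, fc, fh, fw = filter_shape
    assert fc == c, "filter channels must equal input channels"
    return (b, n, h - fh + 1, w - fw + 1)

print(valid_conv_shape(image_shape_1, filter_shape_1))  # (100, 20, 78, 78)
print(valid_conv_shape(image_shape_2, filter_shape_2))  # (100, 50, 22, 22)
```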
Using a filter bank of shape [50, 20, 5, 5] on an input of shape [100, 20, 26, 26] is actually a good example for your first question as well. You have 50 filters, each of shape [20, 5, 5], and every image is of shape [20, 26, 26]. Each convolution uses all 20 channels at once: filter channel 0 is applied to image channel 0, filter channel 1 to image channel 1, and so on, and the per-channel results are summed into a single output map. That is why you get 50 output maps rather than 20 x 50 = 1000. Does that make sense?
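A minimal NumPy sketch of this sum-over-channels behaviour (shapes from your question, batch size shrunk to 4 to keep it quick; this uses plain correlation rather than Theano's flipped convolution, which does not affect the shapes):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 20, 26, 26))       # (batch, channels, h, w)
filters = rng.standard_normal((50, 20, 5, 5))  # (n_filters, channels, fh, fw)

out_h, out_w = 26 - 5 + 1, 26 - 5 + 1          # 'valid' output size: 22 x 22
out = np.empty((4, 50, out_h, out_w))
for i in range(out_h):
    for j in range(out_w):
        patch = x[:, :, i:i + 5, j:j + 5]      # (batch, 20, 5, 5)
        # Sum over all 20 channels AND the 5x5 window for each filter:
        out[:, :, i, j] = np.einsum('bchw,fchw->bf', patch, filters)

# 50 output maps, not 20 * 50 = 1000, because each filter's per-channel
# responses are summed into a single map.
print(out.shape)  # (4, 50, 22, 22)
```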
Upvotes: 4