batuman

Reputation: 7304

Convolution layer in CNN

We know that the convolution layers in a CNN use filters, and that different filters look for different information in the input image.

But let's say that in this SSD network we have a prototxt file that specifies a convolution layer as follows:

layer {
  name: "conv2_1"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2_1"
  param {
    lr_mult: 1.0
    decay_mult: 1.0
  }
  param {
    lr_mult: 2.0
    decay_mult: 0.0
  }
  convolution_param {
    num_output: 128
    pad: 1
    kernel_size: 3
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0.0
    }
  }
}

The convolution layers in different networks (GoogleNet, AlexNet, VGG, etc.) look more or less the same. Just by looking at this definition, how can I tell which information of the input image the filters in this convolution layer try to extract?
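As a side note on reading these fields: `num_output`, `kernel_size`, and `pad` fix the shape of the output blob, not what the filters look for. A minimal sketch of Caffe's output-shape formula (the 150x150 input size below is an assumption for illustration, not taken from SSD):

```python
# Sketch of how convolution_param fields determine the output blob shape
# in Caffe. Field names mirror the prototxt; input size is assumed.
def conv_output_shape(in_h, in_w, num_output, kernel_size, pad=0, stride=1):
    out_h = (in_h + 2 * pad - kernel_size) // stride + 1
    out_w = (in_w + 2 * pad - kernel_size) // stride + 1
    return (num_output, out_h, out_w)

# conv2_1 above: num_output=128, pad=1, kernel_size=3 keeps the spatial
# size unchanged, e.g. on a hypothetical 150x150 input from pool1:
print(conv_output_shape(150, 150, num_output=128, kernel_size=3, pad=1))
# -> (128, 150, 150)
```

With `kernel_size: 3` and `pad: 1` the spatial size is preserved; only the channel count changes (64 for conv1_1, 128 for conv2_1).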

EDIT: Let me clarify my question. I see the following two convolution layers in the prototxt file. They are from SSD.

layer {
  name: "conv1_1"
  type: "Convolution"
  bottom: "data"
  top: "conv1_1"
  param {
    lr_mult: 1.0
    decay_mult: 1.0
  }
  param {
    lr_mult: 2.0
    decay_mult: 0.0
  }
  convolution_param {
    num_output: 64
    pad: 1
    kernel_size: 3
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0.0
    }
  }
}

layer {
  name: "conv2_1"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2_1"
  param {
    lr_mult: 1.0
    decay_mult: 1.0
  }
  param {
    lr_mult: 2.0
    decay_mult: 0.0
  }
  convolution_param {
    num_output: 128
    pad: 1
    kernel_size: 3
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0.0
    }
  }
}

Then I printed their outputs.

Data


The conv1_1 and conv2_1 output images are here and here.

So my question is: how did these two conv layers produce such different outputs, when there is almost no difference between their definitions in the prototxt file?
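(To make the point of the answers below concrete: a minimal numpy sketch, with a hand-rolled single-channel convolution and randomly drawn kernels standing in for learned weights, showing that two filters with identical hyperparameters still produce different responses on the same input.)

```python
import numpy as np

# Naive single-channel 2-D convolution with zero padding (illustration
# only; Caffe's layers are multi-channel and heavily optimized).
def conv2d(img, kernel, pad=1):
    img = np.pad(img, pad)
    kh, kw = kernel.shape
    h = img.shape[0] - kh + 1
    w = img.shape[1] - kw + 1
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

rng = np.random.default_rng(0)
img = rng.random((8, 8))
k1 = rng.normal(size=(3, 3))  # two 3x3 kernels drawn independently,
k2 = rng.normal(size=(3, 3))  # mimicking differently learned weights
r1, r2 = conv2d(img, k1), conv2d(img, k2)
print(np.allclose(r1, r2))  # False: same layer spec, different responses
```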

Upvotes: 0

Views: 1705

Answers (2)

Hossein Kashiani

Reputation: 330

The filters at earlier layers represent low-level features like edges (these features retain high spatial resolution for precise localization, with low-level visual information similar to the response maps of Gabor filters). Filters at mid-level layers, on the other hand, extract more complex features like corners or blobs.


As you go deeper, you can no longer visualize and interpret these features, because filters in mid-level and high-level layers are not directly connected to the input image. For instance, you can visualize the output of the first layer and interpret it as edges, but when you go deeper and apply a second convolution layer to these extracted edges (the output of the first layer), you get something like "edges of edges", which captures more semantic information and less fine-grained spatial detail. In the prototxt file all convolutions (and other types of operation) can resemble each other, but they extract different kinds of features because they sit at different depths in the network and have different weights.
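As an illustration of the "low-level features like edges" point: a first-layer filter often ends up resembling an edge detector. Below is a hand-written Sobel-x kernel (an assumption for demonstration, not an actual learned SSD filter) responding strongly exactly at a vertical edge.

```python
import numpy as np

# Hand-crafted Sobel-x kernel: responds to horizontal intensity change,
# i.e. vertical edges. Learned first-layer filters often look similar.
sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)

# A 6x6 image: dark left half, bright right half -> one vertical edge.
img = np.zeros((6, 6))
img[:, 3:] = 1.0

def conv2d(img, k):
    kh, kw = k.shape
    h, w = img.shape[0] - kh + 1, img.shape[1] - kw + 1
    return np.array([[np.sum(img[i:i + kh, j:j + kw] * k)
                      for j in range(w)] for i in range(h)])

resp = conv2d(img, sobel_x)
print(resp.max())  # 4.0 -- the strongest response sits on the edge
```

A filter in a deeper layer sees these edge responses (not raw pixels) as its input, which is why its role is much harder to read off by eye.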

Upvotes: 4

Shai

Reputation: 114786

"Convolution" layer differ not only in their parameters (e.g., kernel_size, stride, pad etc.) but also in their weights: the trainable parameters of the convolution kernels.
You see different output (aka "responses") because the weights of the filters are different.

See this answer regarding the difference between "data" blobs and "parameter/weights" blobs in Caffe.

Upvotes: 1
