Song Tùng

Reputation: 171

Trouble understanding Fully Convolutional Networks

I'm trying to follow this introduction to using VGG-16 for semantic segmentation. However, a few aspects of the tutorial are unclear to me.

Here are the details of their implementation:

We’ll implement FCN-8, as detailed step-by-step below:

  • Encoder: A pre-trained VGG16 is used as an encoder. The decoder starts from Layer 7 of VGG16.
  • FCN Layer-8: The last fully connected layer of VGG16 is replaced by a 1x1 convolution.

...

What I don't understand is what "layer 7" and "layer 8" refer to. Could someone explain?

Upvotes: 0

Views: 85

Answers (1)

Marcin

Reputation: 1381

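Layer 7 is just the last VGG16 layer that the encoder keeps. The numbers follow the usual VGG naming, where the five convolutional blocks are conv1 to conv5 and the three fully connected layers are fc6, fc7 and fc8, so layer 7 is fc7 (used in convolutional form here) and layer 8 is fc8.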

Layer 8 would normally be a fully connected (dense) layer, but here it is replaced with a 1x1 convolution. If the input to this layer has spatial size 1x1, the two are the same thing. Otherwise, it's as if you mapped each pixel through the same dense layer.

So if your input size is such that the input to layer 8 is 1x1, there is no difference. If it's bigger, the network still works in a meaningful way: instead of a single prediction, you get one prediction per spatial position.
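To make the equivalence concrete, here is a minimal NumPy sketch. The helper names `dense` and `conv1x1`, the channels-last layout, and the toy sizes (16 input channels, 3 classes) are my own assumptions for illustration, not taken from the tutorial or from VGG16 itself.

```python
import numpy as np


def dense(x, weights, bias):
    """Fully connected layer: x has shape (C_in,), weights has shape (C_in, C_out)."""
    return x @ weights + bias


def conv1x1(feature_map, weights, bias):
    """1x1 convolution over a feature map of shape (H, W, C_in).

    The same (C_in, C_out) weight matrix is applied at every spatial position.
    """
    return np.einsum("hwc,co->hwo", feature_map, weights) + bias


rng = np.random.default_rng(0)
c_in, num_classes = 16, 3  # toy sizes, not the real VGG16 dimensions
weights = rng.normal(size=(c_in, num_classes))
bias = rng.normal(size=num_classes)

# Case 1: the input is 1x1 spatially, so the 1x1 convolution is just a dense layer.
single = rng.normal(size=(1, 1, c_in))
same_as_dense = np.allclose(
    conv1x1(single, weights, bias)[0, 0],
    dense(single[0, 0], weights, bias),
)

# Case 2: the input is larger, so every position goes through the same dense layer.
height, width = 4, 5
feature_map = rng.normal(size=(height, width, c_in))
scores = conv1x1(feature_map, weights, bias)
per_pixel = np.array(
    [
        [dense(feature_map[i, j], weights, bias) for j in range(width)]
        for i in range(height)
    ]
)

print(same_as_dense)                   # whether the 1x1 case matches the dense layer
print(scores.shape)                    # one score vector per spatial position
print(np.allclose(scores, per_pixel))  # whether conv1x1 matches per-pixel dense
```

In the second case the output keeps the spatial layout of the input, which is why the network can take inputs larger than the size it was originally trained on.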

Upvotes: 1
