user3094631

Reputation: 451

Can I view a convolutional neural network as a fully connected neural network?

For example, there is a 3x3 image,

and a convolutional neural network with two 2x2 filters convolves the image,

so in the end the output dimension is 2x2x2.

Can I view the above procedure as follows?

Because of the 2x2 filter, after sliding it over the whole image, I get 4 small images,

and use these 4 small images as input to a fully connected neural network,

so in the end I also get 8 outputs.

I don't really know backpropagation in CNNs, so I am trying to understand it through the classic fully connected neural network.

By inputting a small image, we can update the weights in the fully connected neural network once; is that the same thing as updating the weights of the filters in a CNN?

Am I thinking about it right?


Upvotes: 0

Views: 134

Answers (1)

lejlot

Reputation: 66795

In short, yes. You can see CNN as (among other possible interpretations):

  • a neural net with a convolution operation and gradients computed directly for it (the typical approach)
  • a fully connected network with weight sharing. For simplicity, let's assume the input is 1-D of size 3 and you have a kernel of size 2, so it looks like

    [X1 X2 X3] conv [w1 w2] = [X1w1+X2w2 X2w1+X3w2]

    which is equivalent to a fully connected network with weights vij, where vij means "the weight between the ith input neuron and the jth hidden neuron":

    X1
    X2     h1 = X1v11+X2v21+X3v31
    X3     h2 = X1v12+X2v22+X3v32
    

    thus if you put v31=v12=0 and v11=v22, v21=v32, you get exactly the same network. By equality I mean literally that these are the same variables (hence the term weight sharing).
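    A minimal NumPy sketch of this equivalence (the numbers are arbitrary, chosen just for illustration): building the 3x2 weight matrix with v31=v12=0, v11=v22=w1, v21=v32=w2 reproduces the convolution output exactly.

    ```python
    import numpy as np

    # Toy 1-D example from above: input of size 3, kernel of size 2.
    X = np.array([1.0, 2.0, 3.0])   # [X1, X2, X3]
    w = np.array([0.5, -1.0])       # [w1, w2]

    # Convolution view: slide the kernel over the input.
    conv_out = np.array([X[0]*w[0] + X[1]*w[1],
                         X[1]*w[0] + X[2]*w[1]])

    # Fully connected view: V[i, j] = v_ij, with the tied/zeroed entries.
    V = np.array([[w[0], 0.0],
                  [w[1], w[0]],
                  [0.0,  w[1]]])
    fc_out = X @ V

    print(np.allclose(conv_out, fc_out))  # True
    ```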

  • a collection of small neural networks (again, with weight sharing), each connected to a different small patch of the input (which is what you are proposing). So your whole model looks like:

                       /-patch 1 -- shared fully connected net\ 
    Input --splitting----patch 2 -- shared fully connected net--merging---
                       .                                      .
                       .                                      . 
                       \-patch K -- shared fully connected net/ 
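    The same toy example, written in this patch view (again with arbitrary illustrative numbers): slice the input into overlapping patches, feed every patch through one shared small net, and merge the results. This matches the convolution output.

    ```python
    import numpy as np

    # Patch view: K overlapping patches, one shared "dense" layer.
    X = np.array([1.0, 2.0, 3.0])
    w = np.array([0.5, -1.0])           # shared weights of the small net

    patches = [X[0:2], X[1:3]]          # patch 1 = [X1, X2], patch 2 = [X2, X3]
    merged = np.array([p @ w for p in patches])

    # Direct convolution, for comparison.
    conv_out = np.array([X[0]*w[0] + X[1]*w[1],
                         X[1]*w[0] + X[2]*w[1]])

    print(np.allclose(merged, conv_out))  # True
    ```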
    

These are just three views of the same object; in all cases, if you compute the partial derivatives (gradients), you end up with exactly the same equations.
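This last claim, which also answers the backpropagation part of the question, can be checked numerically. A sketch with a simple loss L = sum of outputs (my choice for illustration): in the fully connected view, the gradient of a shared weight is the sum of the gradients of all its tied copies, and that sum equals the convolution-view gradient.

```python
import numpy as np

# Check that both views give the same gradient w.r.t. the shared weights,
# using the loss L = h1 + h2.
X = np.array([1.0, 2.0, 3.0])

# Convolution view: h = [X1w1+X2w2, X2w1+X3w2],
# so dL/dw1 = X1 + X2 and dL/dw2 = X2 + X3.
grad_conv = np.array([X[0] + X[1], X[1] + X[2]])

# FC view: dL/dv_ij = X_i for every connection; weight sharing means
# summing the gradients of the tied copies of each weight.
dL_dV = np.outer(X, np.ones(2))      # dL/dv_ij = X_i
grad_w1 = dL_dV[0, 0] + dL_dV[1, 1]  # v11 and v22 are both w1
grad_w2 = dL_dV[1, 0] + dL_dV[2, 1]  # v21 and v32 are both w2
grad_fc = np.array([grad_w1, grad_w2])

print(np.allclose(grad_conv, grad_fc))  # True
```

So updating the filter weights in a CNN is exactly updating the shared weights of the equivalent fully connected network, with the tied gradients summed.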

Upvotes: 3
