Reputation: 451
For example, there is a 3x3 image,
and a convolutional neural network that has two 2x2 filters convolves the image;
in the end, the output dimension is 2x2x2.
Can I view the above procedure as follows?
Because of the 2x2 filter, after sliding it over the whole image, I get 4 small images,
and I use these 4 small images as the input of a fully connected neural network;
in the end, I also get 8 outputs.
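Here is a small numpy sketch of what I mean (the image and filter values are just made up for illustration):

```python
import numpy as np

# a made-up 3x3 image and two 2x2 filters (plain cross-correlation, "valid" mode)
image = np.arange(9, dtype=float).reshape(3, 3)
filters = np.array([[[1.0, 0.0], [0.0, -1.0]],
                    [[0.5, 0.5], [0.5, 0.5]]])       # shape (2, 2, 2): 2 filters of 2x2

# sliding both 2x2 filters over the 3x3 image -> output of shape 2x2x2
conv_out = np.zeros((2, 2, 2))
for f in range(2):
    for i in range(2):
        for j in range(2):
            conv_out[i, j, f] = np.sum(image[i:i+2, j:j+2] * filters[f])

# the "4 small images" view: extract the four 2x2 patches and feed every one of them
# through the same fully connected layer whose weight matrix is the flattened filters
patches = np.array([image[i:i+2, j:j+2].ravel()
                    for i in range(2) for j in range(2)])   # shape (4, 4)
W = filters.reshape(2, -1).T                                 # shape (4, 2)
fc_out = patches @ W                                         # shape (4, 2): the 8 outputs

print(np.allclose(conv_out.reshape(4, 2), fc_out))           # True
```

The 4x2 array fc_out contains the 8 outputs I mentioned, and it matches the 2x2x2 convolution output.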
I don't really know back propagation in CNNs, so I am trying to understand it via the classic fully connected neural network.
By inputting a small image, we can update the weights of the fully connected neural network once; is that the same thing as updating the weights of the filters in a CNN?
Am I thinking about this correctly?
Upvotes: 0
Views: 134
Reputation: 66795
In short, yes. You can see a CNN as (among other possible interpretations):
a fully connected network with weight sharing. For simplicity, let's assume the input is 1D of size 3 and you have a kernel of size 2, so it looks like
[X1 X2 X3] conv [w1 w2] = [X1w1+X2w2  X2w1+X3w2]
which is equivalent to having a fully connected network with weights vij, meaning "the weight between the ith input neuron and the jth hidden neuron":
X1
X2    h1 = X1v11 + X2v21 + X3v31
X3    h2 = X1v12 + X2v22 + X3v32
Thus, if you set v31 = v12 = 0, v11 = v22 (= w1) and v21 = v32 (= w2), you get exactly the same network. By equality I mean literally that it is the same variable (hence the term weight sharing).
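A quick numpy check of this equivalence (a sketch only; the input values and kernel weights below are arbitrary):

```python
import numpy as np

# 1D "image" of size 3 and a kernel of size 2 (arbitrary example values)
x = np.array([1.0, 2.0, 3.0])        # X1, X2, X3
w1, w2 = 0.5, -1.0                   # the shared kernel weights

# convolution view: [X1w1+X2w2, X2w1+X3w2]
conv = np.array([x[0]*w1 + x[1]*w2,
                 x[1]*w1 + x[2]*w2])

# fully connected view: 3x2 weight matrix V with V[i, j] = vij,
# tied as v11 = v22 = w1, v21 = v32 = w2 and v31 = v12 = 0
V = np.array([[w1, 0.0],
              [w2, w1],
              [0.0, w2]])
fc = x @ V                           # [h1, h2]

print(np.allclose(conv, fc))         # True: the same network, written differently
```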
a collection of small neural networks (again, with weight sharing) which are connected to different small patches of the input (which is what you are proposing). So your whole model looks like:
         /--patch 1 -- shared fully connected net--\
Input --splitting--patch 2 -- shared fully connected net--merging---
         .                                          .
         .                                          .
         \--patch K -- shared fully connected net--/
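The same toy example written in this patch view (again just a sketch, with the same arbitrary numbers as above):

```python
import numpy as np

# same toy input: split it into overlapping patches, run the *same* small
# fully connected net on each patch, then merge the per-patch outputs
x = np.array([1.0, 2.0, 3.0])
w = np.array([0.5, -1.0])                      # the shared weights of the small net

patches = [x[0:2], x[1:3]]                     # patch 1, patch 2 (K = 2 here)
merged = np.array([p @ w for p in patches])    # shared net on each patch, then merge

# this is again exactly the convolution output
conv = np.array([x[0]*w[0] + x[1]*w[1],
                 x[1]*w[0] + x[2]*w[1]])
print(np.allclose(merged, conv))               # True
```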
These are all just views of the same object; in every case, if you compute the partial derivatives (gradients), you will end up with exactly the same equations.
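To make the back-propagation point concrete, here is a sketch (same arbitrary numbers as above, with a toy loss I picked just for illustration) showing that the gradient of each shared weight is the sum of the gradients of the fully connected weights tied to it:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
w1, w2 = 0.5, -1.0

# forward pass (identical in every view)
h = np.array([x[0]*w1 + x[1]*w2, x[1]*w1 + x[2]*w2])
loss = 0.5 * np.sum(h**2)                  # a toy loss
dL_dh = h                                  # d(0.5*h_j^2)/dh_j = h_j

# convolution view: each shared weight collects gradient from every position it is used at
grad_w1_conv = dL_dh[0]*x[0] + dL_dh[1]*x[1]
grad_w2_conv = dL_dh[0]*x[1] + dL_dh[1]*x[2]

# fully connected view: per-connection gradients dL/dvij = dL/dhj * Xi,
# and weight sharing sums the gradients of the tied connections
grad_V = np.outer(x, dL_dh)                # grad_V[i, j] = dL/dv(i+1)(j+1)
grad_w1_fc = grad_V[0, 0] + grad_V[1, 1]   # v11 and v22 are literally the same variable w1
grad_w2_fc = grad_V[1, 0] + grad_V[2, 1]   # v21 and v32 are literally the same variable w2

print(np.allclose([grad_w1_conv, grad_w2_conv],
                  [grad_w1_fc, grad_w2_fc]))  # True
```

So updating the filter weights in a CNN is the same as updating the tied weights of the equivalent fully connected network, with each patch contributing to the shared gradient.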
Upvotes: 3