How does a convolution layer take 6 inputs and give 16 outputs?

Question

We're trying to build a forward convolutional neural network on an FPGA. The configuration of our build is based on the LeNet-5 architecture.

The first convolution layer is no problem: it takes 1 input (the photo) and gives 6 outputs (6 feature maps) using 6 filters of size 5*5.

By the way, we trained our network on our data using Spyder, TensorFlow, etc.

But in the second convolution layer there are 6 inputs (the outputs of the first max pooling layer) and 16 outputs, using 16 filters of size 5*5*6. Our research assistant told us: "You have 6 inputs and a (5*5) filter with a depth of 6. It means every input corresponds to its own depth slice of the filter. At the end of the convolution, you sum all of the multiplication results, so that you have just 1 output for 1 filter."

But at which step do we sum the multiplication results?

In Python (Spyder/TensorFlow), the conv2d function does something internally and we just get the results. But in hardware, I need to know how this proceeds step by step.

Thank you for your help. Sorry for my English.

Here is the explanation with a picture.

Tinu · Accepted Answer

Take a moment and have a look at this:

http://cs231n.github.io/assets/conv-demo/index.html

I found this gif very helpful when learning how convolution is calculated in detail. Hopefully it helps you understand how it proceeds in "hardware".
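To make the summing step concrete, here is a minimal pure-Python sketch of what a multi-channel convolution layer computes. It is not TensorFlow's actual conv2d implementation, just the same arithmetic written as explicit loops, which is roughly how a multiply-accumulate (MAC) unit would walk through it in hardware. The function name conv2d_multi_channel, the 14*14 input size, and the all-ones test data are assumptions made for illustration. It also assumes stride 1, no padding, and that every filter is connected to all 6 input maps, as described in the question.

```python
def conv2d_multi_channel(inputs, filters, biases):
    """Stride-1, no-padding convolution (cross-correlation, as used in CNNs).

    inputs:  [C][H][W]     C input feature maps
    filters: [F][C][K][K]  F filters, each with one K*K slice per input map
    biases:  [F]           one bias per filter
    returns: [F][H-K+1][W-K+1]
    """
    num_ch = len(inputs)
    height = len(inputs[0])
    width = len(inputs[0][0])
    k = len(filters[0][0])
    out_h = height - k + 1
    out_w = width - k + 1

    outputs = []
    for f, filt in enumerate(filters):       # one output map per filter
        fmap = []
        for y in range(out_h):
            row = []
            for x in range(out_w):
                # A single accumulator per output pixel: the products from
                # ALL input maps and ALL kernel positions are summed here.
                acc = biases[f]
                for c in range(num_ch):      # depth slice c meets input map c
                    for i in range(k):
                        for j in range(k):
                            acc += inputs[c][y + i][x + j] * filt[c][i][j]
                row.append(acc)
            fmap.append(row)
        outputs.append(fmap)
    return outputs


# Shapes of the second convolution layer (14*14 input size is an assumption).
inputs = [[[1.0] * 14 for _ in range(14)] for _ in range(6)]
filters = [[[[1.0] * 5 for _ in range(5)] for _ in range(6)] for _ in range(16)]
biases = [0.0] * 16

outputs = conv2d_multi_channel(inputs, filters, biases)
print(len(outputs), len(outputs[0]), len(outputs[0][0]))  # 16 10 10
print(outputs[0][0][0])  # 150.0, i.e. 6 * 5 * 5 products of 1.0 * 1.0
```

So the sum happens inside the innermost loops: for one output pixel of one filter, all 6*5*5 = 150 products go into the same accumulator, and only then would you apply the activation. In hardware you could equally compute 6 separate 5*5 partial sums (one per input map) and add them with an adder tree; the result is the same.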
