Can't understand how filters in a Conv net are calculated
I've been studying machine learning for 4 months, and I understand the concepts behind the MLP. The problem came when I started reading about Convolutional Neural Networks. Let me tell you what I know and then ask what I'm having trouble with.
The core parts of a CNN are:
- Convolutional Layer: you have n filters, and each one is slid over the input to produce one feature map, so the layer outputs n feature maps.
- ReLU Layer: applies the activation max(0, x) element-wise to the output of the convolutional layer, so negative values become zero.
- Sub-sampling (pooling) Layer: shrinks each feature map, producing a smaller map that summarizes each local region, which lets later layers respond to larger, more abstract patterns.
These three layers are repeated several times, and the last part is an ordinary classifier, such as an MLP.
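To make this concrete, here is a minimal NumPy sketch of one pass through the three layers as I understand them. The function names (`conv2d_valid`, `relu`, `max_pool2x2`), the 8x8 input, and the filter sizes are all assumptions for illustration, and the filters are just random numbers, since how they get their values is exactly what I'm unsure about:

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide one 2-D kernel over one 2-D image (no padding, stride 1).

    Note: the kernel is not flipped, so strictly this is cross-correlation,
    which is what most CNN descriptions call "convolution".
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    """Element-wise max(0, x)."""
    return np.maximum(x, 0.0)

def max_pool2x2(x):
    """Keep the maximum of each non-overlapping 2x2 block."""
    h, w = x.shape[0] // 2, x.shape[1] // 2
    return x[:h * 2, :w * 2].reshape(h, 2, w, 2).max(axis=(1, 3))

rng = np.random.default_rng(0)
image = rng.standard_normal((8, 8))       # hypothetical single-channel input
filters = rng.standard_normal((3, 3, 3))  # n = 3 filters, each 3x3, random values

# One feature map per filter: conv -> ReLU -> sub-sampling.
feature_maps = np.stack([max_pool2x2(relu(conv2d_valid(image, f))) for f in filters])
print(feature_maps.shape)  # (3, 3, 3): 3 maps, each 3x3
```

With an 8x8 input and 3x3 filters, each convolution gives a 6x6 map, and 2x2 pooling halves that to 3x3.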
My questions are the following:
- How do I create the filters used in the Convolutional Layer? Do I have to design and train each filter separately and then plug it into the Conv Layer, or are the filters learned together with the rest of the network by the backpropagation algorithm?
- Imagine I have a conv layer with 3 filters, so it outputs 3 feature maps. After applying the ReLU and Sub-sampling layers, I still have 3 feature maps (smaller ones). When these pass through the next Conv Layer, how do I calculate the output? Do I apply each filter to each feature map separately, or do I apply some operation across all 3 feature maps and then sum the results? I have no idea how to calculate the output of this second Conv Layer, or how many feature maps it will output.
- How do I pass the data from the Conv layers to the MLP (for classification in the last part of the NN)?
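To make the second and third questions concrete, here is a sketch of the convention I have most often seen described: each filter in the second layer has one k x k slice per incoming feature map, the per-map results are summed into a single output map, and the layer outputs one map per filter. The name `conv_layer`, the shapes, and the zero biases are assumptions for illustration, and the last lines show the flattening I assume is meant for feeding the MLP. Is this right?

```python
import numpy as np

def conv_layer(inputs, filters, biases):
    """Multi-channel convolution (no padding, stride 1).

    inputs:  (C_in, H, W)        feature maps from the previous layer
    filters: (C_out, C_in, k, k) each filter spans ALL input maps
    biases:  (C_out,)            one bias per output map
    """
    c_out, c_in, k, _ = filters.shape
    assert inputs.shape[0] == c_in, "filter depth must match number of input maps"
    out_h = inputs.shape[1] - k + 1
    out_w = inputs.shape[2] - k + 1
    out = np.zeros((c_out, out_h, out_w))
    for o in range(c_out):
        for i in range(out_h):
            for j in range(out_w):
                patch = inputs[:, i:i + k, j:j + k]  # shape (C_in, k, k)
                # Multiply across every input map and sum everything into one number.
                out[o, i, j] = np.sum(patch * filters[o]) + biases[o]
    return out

rng = np.random.default_rng(0)
maps_in = rng.standard_normal((3, 6, 6))     # 3 feature maps from the first stage
filters = rng.standard_normal((5, 3, 3, 3))  # 5 filters, each with 3 slices of 3x3
biases = np.zeros(5)

maps_out = conv_layer(maps_in, filters, biases)
print(maps_out.shape)   # (5, 4, 4): output count equals the number of filters

# Assumed hand-off to the classifier: flatten all maps into one vector.
mlp_input = maps_out.reshape(-1)
print(mlp_input.shape)  # (80,)
```

Under this convention the number of output maps depends only on how many filters the layer has (5 here), not on how many maps came in.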
If anyone knows of a simple CNN implementation that doesn't use a framework, I would appreciate a pointer. I think the best way to learn how something works is to build it yourself. Later, once you understand how it works, you can use frameworks, because they save a lot of time.