Reputation: 1732
I read through this page: http://neuralnetworksanddeeplearning.com/chap1.html to better understand how a neural network works. I want to create a simple feedforward network in Java, with no backpropagation or training.
What I am fuzzy on is the math involved for each "Neuron" of a layer in a network. Say I have three layers. The first layer takes an input vector of size 100. Does this mean my first layer will have 100 Neurons? Does this also mean that the input for each Neuron will be the sum of all 100 inputs multiplied by weights? Is that sum the input to the activation function of my Neuron?
In the chapter it was mentioned that the sum of all inputs to a neuron/perceptron can be restated as the dot product of the inputs (x) and weights (w). I can view them as two separate vectors and their dot product gives me x1w1, x2w2, x3w3, etc., but how does the sum x1w1 + x2w2 + ... still equal the dot product?
Finally, if a layer is supposed to have an input of 100 and an output of 1000, does that mean that the layer will actually have 1000 neurons and each neuron takes 100 inputs? So the layer outputs 1 value per neuron, thus giving 1000 outputs?
I apologize in advance if these questions are completely off or trivial, I have read a few documents online and this is my understanding so far, but it is hard to verify without asking someone who really understands it. If you have additional resources or videos that will help, they are much appreciated.
Upvotes: 1
Views: 1174
Reputation: 150
This is my first answer on Stack Overflow, so please go easy.
If I'm understanding your question right, you're wondering how the math behind an artificial neuron works. The artificial neuron has a fairly simple structure; it is made up of 5 components. (The subscript i indicates the i-th input or weight.)

1. A set of inputs x_i.
2. A set of weights w_i, one per input.
3. A threshold value.
4. An activation function f.
5. The neuron output Y.
For example, using the unit step activation function, you can determine a set of weights (and a threshold value) that makes the neuron act as a simple binary classifier.
Looking at number 4, the activation function f: many different functions can be used here, the identity function being the simplest.
The neuron output Y, is the result of applying the activation function to the weighted sum of the inputs, less the threshold.
This value can be discrete or real depending on the activation function used.
Written out, the output is Y = f(x1w1 + x2w2 + ... + xnwn - threshold).
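To your dot product question: the dot product of x and w is *defined* as the sum of the element-wise products, so x1w1 + x2w2 + ... isn't something extra on top of the dot product, it *is* the dot product. A minimal sketch of a single neuron in Java (class and method names are my own, using the unit step activation as the example):

```java
// A minimal sketch of a single artificial neuron (illustrative, not a library API).
public class Neuron {
    private final double[] weights; // one weight w_i per input x_i
    private final double threshold;

    public Neuron(double[] weights, double threshold) {
        this.weights = weights;
        this.threshold = threshold;
    }

    // The dot product x . w: the element-wise products x_i * w_i are the
    // individual terms, and the dot product is defined as their sum.
    private double dot(double[] inputs) {
        double sum = 0.0;
        for (int i = 0; i < inputs.length; i++) {
            sum += inputs[i] * weights[i];
        }
        return sum;
    }

    // Y = f(x . w - threshold), here with the unit step function as f.
    public double output(double[] inputs) {
        double weightedSum = dot(inputs) - threshold;
        return weightedSum >= 0 ? 1.0 : 0.0;
    }
}
```

So for your 100-element input vector, each neuron in the first layer would hold 100 weights, compute the single number `dot(inputs) - threshold`, and feed that number through its activation function.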
Once the output has been calculated, it can be passed to another neuron (or group of neurons) or sampled by the external environment. The interpretation of the neuron output depends on the problem under consideration.
@Seephore
In principle, there is no limit on the number of hidden layers that can be used in an artificial neural network. Such networks can be trained using "stacking" or other techniques from the deep learning literature. Yes, you could have 1000 layers, though I don't know if you'd get much benefit: in deep learning I've more typically seen somewhere between 1 and 20 hidden layers rather than 1000. In practice the number of layers is based on pragmatic concerns, e.g., what will lead to good accuracy with reasonable training time and without overfitting.
As for what you're asking: I'm going to assume you meant 100 inputs and 1000 outputs. Each neuron in the layer takes the full weighted input and produces a single output value; that output is then passed to every node (neuron) in the next layer, but the value itself still comes from the one node that produced it.
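So yes: a layer with 100 inputs and 1000 outputs has 1000 neurons, each holding 100 weights, and each neuron contributes exactly one value to the output vector. A rough Java sketch (my own naming, identity activation to keep it short):

```java
// Sketch of a fully connected layer: one row of weights per neuron.
public class Layer {
    private final double[][] weights; // weights[neuron][input]

    public Layer(double[][] weights) {
        this.weights = weights;
    }

    // One output value per neuron: outputs[j] = sum_i inputs[i] * weights[j][i].
    // Identity activation (f(x) = x) is used here for simplicity.
    public double[] forward(double[] inputs) {
        double[] outputs = new double[weights.length];
        for (int j = 0; j < weights.length; j++) {
            double sum = 0.0;
            for (int i = 0; i < inputs.length; i++) {
                sum += inputs[i] * weights[j][i];
            }
            outputs[j] = sum;
        }
        return outputs;
    }
}
```

For your case you'd construct it with a 1000-by-100 weight matrix, e.g. `new Layer(new double[1000][100])`, and `forward` on a 100-element input would return a 1000-element output array.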
There are many wishy-washy books out there for Java, but if you really want to get into it, read this.
Upvotes: 1