Reputation: 23
Indeed, even if the activation function produced values in the range from -10 to 10, it seems to me this would make the network more flexible. After all, the problem cannot be merely the absence of a suitable formula. Please explain what I am missing.
Upvotes: 1
Views: 111
Reputation: 117
The activation function of a particular node in a neural network takes as its input the weighted sum of the outputs of the previous layer.
If that previous layer itself has an activation function, then this input is simply a weighted sum of node outputs that have already been transformed by that layer's activation function. If the function is a squashing function, such as the sigmoid, then each operand in the weighted sum is squashed to a small value before the sum is taken.
If the previous layer has only a couple of nodes, the value passed into the current node's activation function will likely be small. However, if the previous layer has many nodes, the value will not necessarily be small: for example, 1000 sigmoid outputs averaging 0.5, with weights near 1, sum to roughly 500.
The input to an activation function in a neural network depends on:
- the number of nodes in the previous layer,
- the values of the weights connecting that layer to the current node, and
- the activation function (if any) applied in the previous layer.
Therefore, the values passed to an activation function can really be anything, as the sketch below illustrates.
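Here is a minimal NumPy sketch of that point (the layer widths, the random standard-normal weights, and the sigmoid are all illustrative assumptions, not something fixed by the question): even though every output of the previous layer is squashed into (0, 1), the weighted sum arriving at the next node tends to grow with the width of the layer.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

for n_prev in (2, 10, 1000):           # width of the previous layer (illustrative)
    pre = rng.normal(size=n_prev)      # raw pre-activations of the previous layer
    out = sigmoid(pre)                 # squashed outputs, each in (0, 1)
    w = rng.normal(size=n_prev)        # weights into the current node (assumed standard normal)
    z = w @ out                        # weighted sum fed to the current node's activation
    print(f"{n_prev:4d} nodes -> input to activation: {z:+.2f}")
```

With standard-normal weights the typical magnitude of this sum grows roughly like the square root of the layer width, and with all-positive weights it grows linearly, so nothing constrains the input to an activation function to a small range.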
Upvotes: 1