Reputation: 260
I am working with neural networks in my free time. I have already built a simple XOR operation with a neural network, but I don't know how to choose the right activation function.
Is there a trick to it, or is it just mathematical reasoning?
Upvotes: 4
Views: 278
Reputation: 767
The subject of when to use a particular activation function over another is a subject of ongoing academic research. You can find papers related to it by searching for journal articles related to "neural network activation function" in an academic database, or through a Google Scholar search.
Generally, which function to use depends mostly on what you are trying to do. An activation function is like a lens. You put input into your network, and it comes out changed or focused in some way by the activation function. How your input should be changed depends on what you are trying to achieve. You need to think of your problem, then figure out what function will help you shape your signal into the results you are trying to approximate.
Ask yourself, what is the shape of the data you are trying to model? If it is linear or approximately so, then a linear activation function will suffice. If it is more "step-shaped," you would want to use something like Sigmoid or Tanh (the Tanh function is actually just a scaled Sigmoid), because their graphs exhibit a similar shape. In the case of your XOR problem, we know that either of those--which squash the output into a bounded range ([0, 1] for Sigmoid, [-1, 1] for Tanh)--will work quite well. If you need something that doesn't flatten out away from zero like those two do, the ReLU function might be a good choice (in fact ReLU is probably the most popular activation function these days, and deserves far more serious study than this answer can provide).
You should analyze the graph of each one of these functions and think about the effects each will have on your data. You know the data you will be putting in. When that data goes through the function, what will come out? Will that particular function help you get the output you want? If so, it is a good choice.
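As a starting point, here is a minimal sketch (assuming NumPy and Matplotlib are available) that plots the three functions mentioned above so you can compare their shapes side by side; it also shows the "scaled Sigmoid" relationship, tanh(x) = 2*sigmoid(2x) - 1.

    # Plot sigmoid, tanh and ReLU to compare their shapes.
    import numpy as np
    import matplotlib.pyplot as plt

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def relu(x):
        return np.maximum(0.0, x)

    x = np.linspace(-5, 5, 200)
    plt.plot(x, sigmoid(x), label="sigmoid")            # saturates at 0 and 1
    plt.plot(x, np.tanh(x), label="tanh")               # saturates at -1 and 1
    plt.plot(x, 2 * sigmoid(2 * x) - 1, "--", label="2*sigmoid(2x)-1")  # overlaps tanh
    plt.plot(x, relu(x), label="ReLU")                  # linear for x > 0, zero otherwise
    plt.legend()
    plt.show()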
Furthermore, if you have a graph of some data with a really interesting shape that corresponds to some other function you know, feel free to use that one and see how it works! Some of ANN design is about understanding, but other parts (at least currently) are intuition.
Upvotes: 0
Reputation: 1
You can solve your problem with sigmoid neurons. In that case the activation function is:

σ(z) = 1 / (1 + e^(-z))

where:

z = Σ_j w_j x_j + b
In this formula the w_j are the weights for each input, b is the bias, and the x_j are the inputs. Finally, you can use back-propagation to minimize the cost function.
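A minimal sketch of a single sigmoid neuron's forward pass, assuming NumPy; the weights, bias, and input values below are just made-up example numbers:

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    w = np.array([0.5, -0.3])   # weights for each input (example values)
    b = 0.1                     # bias (example value)
    x = np.array([1.0, 0.0])    # inputs, e.g. one XOR input pair

    z = np.dot(w, x) + b        # z = sum_j w_j * x_j + b
    a = sigmoid(z)              # neuron output, squashed into (0, 1)
    print(a)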
Upvotes: 0
Reputation: 81
There are a lot of activation functions to choose from, such as identity, logistic, tanh, ReLU, etc. The choice can be based on how the gradient behaves during back-propagation. E.g. the logistic function is differentiable everywhere, but it saturates when the input has a large magnitude, which slows down optimization. In that case ReLU is preferred over logistic. That is only one simple example of how to choose an activation function; it really depends on the actual situation. Besides, I don't think the activation function used in an XOR network is representative of more complex applications.
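To illustrate the saturation point, here is a small sketch (assuming NumPy) comparing the gradients of the logistic and ReLU activations: for large |z| the logistic gradient shrinks toward zero, while the ReLU gradient stays at 1 for positive inputs.

    import numpy as np

    def logistic_grad(z):
        s = 1.0 / (1.0 + np.exp(-z))
        return s * (1.0 - s)          # peaks at 0.25, vanishes for large |z|

    def relu_grad(z):
        return np.where(z > 0, 1.0, 0.0)  # 1 for positive inputs, 0 otherwise

    for z in [0.0, 2.0, 10.0]:
        print(z, logistic_grad(z), relu_grad(z))
    # At z = 10 the logistic gradient is about 4.5e-5, so learning there is very slow.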
Upvotes: 1