Reputation: 427
I'm currently working on my own neural network implementation in Java. I have already implemented some common activation functions, such as sigmoid and ReLU, but I don't know how to implement softmax.
I want to have a method like
private double softmax(double input) {
    double output = ???;
    return output;
}
Any ideas what an implementation could look like? I also need the derivative of the softmax for my learning algorithm.
Upvotes: 2
Views: 3763
Reputation: 29285
Softmax doesn't take a single input value. It takes the vector of all values of the current NN layer as input (by "values" I mean the outputs of the previous layer, multiplied by the kernel matrix and with the biases added) and outputs a probability distribution in which every value falls into the [0, 1] range.
So, if your NN layer for example has 5 units/neurons, the softmax function takes those 5 values as input and normalizes them into a probability distribution in which all 5 output values lie in [0, 1] and sum to 1, using the following formula:

softmax(Z)_i = exp(Z_i) / (exp(Z_1) + exp(Z_2) + ... + exp(Z_K)),  for i = 1, ..., K

where, in our example, K = 5 and Z_1, Z_2, ..., Z_5 are the components of the input vector Z.
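For instance, for the input vector Z = (1, 2, 3, 4, 5), the exponentials are roughly 2.72, 7.39, 20.09, 54.60 and 148.41; their sum is about 233.2, so softmax(Z) ≈ (0.012, 0.032, 0.086, 0.234, 0.636), which sums to 1.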
Here's sample Java code implementing softmax (you'll need import java.util.Arrays; for the stream):

private double softmax(double input, double[] neuronValues) {
    // Denominator of the formula: the sum of e^x over all values in the layer
    double total = Arrays.stream(neuronValues).map(Math::exp).sum();
    // e^input divided by that sum is this neuron's probability
    return Math.exp(input) / total;
}
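Note that Math.exp overflows for large inputs, so practical implementations usually subtract the maximum value before exponentiating; the result is unchanged because softmax is invariant to shifting all inputs by a constant. Here's a minimal sketch of a vectorized variant along those lines (my own addition, assuming you want the whole output distribution at once):

import java.util.Arrays;

private double[] softmax(double[] neuronValues) {
    // Shift by the max before exponentiating to avoid overflow;
    // this doesn't change the result since the shift cancels out.
    double max = Arrays.stream(neuronValues).max().orElse(0.0);
    double[] exps = Arrays.stream(neuronValues).map(v -> Math.exp(v - max)).toArray();
    double total = Arrays.stream(exps).sum();
    // Normalize so the outputs form a probability distribution summing to 1
    return Arrays.stream(exps).map(v -> v / total).toArray();
}

As for the derivative you asked about: the partial derivative of output S_i with respect to input Z_j is S_i * (1 - S_i) when i = j, and -S_i * S_j when i != j. If you pair softmax with a cross-entropy loss, the combined gradient with respect to the inputs simplifies to S - Y (predicted distribution minus the one-hot target), which is what most backpropagation implementations compute directly.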
Upvotes: 5