Reputation: 427
I'm currently working on my own neural network implementation in Java. I have already implemented some common activation functions, such as sigmoid and ReLU, but I don't know how to implement softmax.
I want to have a method like
private double softmax(double input) {
    double output = ???;
    return output;
}
Any ideas what an implementation could look like? I also need the derivative of the softmax for my learning algorithm.
Upvotes: 2
Views: 3763
Reputation: 29285
Softmax doesn't take a single input value. It takes the vector of all values of the current NN layer as input (by "values" I mean the outputs of the previous layer, multiplied by the kernel matrix and with the biases added) and outputs a probability distribution in which every value falls into the [0, 1] range.
So, if your NN layer for example has 5 units/neurons, the softmax function takes those 5 values as input and normalizes them into a probability distribution in which all 5 output values lie in [0, 1] and sum to 1, using the following formula:

softmax(Z)_i = exp(Z_i) / (exp(Z_1) + exp(Z_2) + ... + exp(Z_K)),  for i = 1, ..., K

where, in our example, K = 5 and Z_1, Z_2, ..., Z_5 are the components of the input vector Z.
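For instance, for the input vector Z = (1, 2, 3, 4, 5), the exponentials are roughly 2.72, 7.39, 20.09, 54.60 and 148.41; their sum is about 233.2, so softmax(Z) ≈ (0.012, 0.032, 0.086, 0.234, 0.636), which sums to 1.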
Here's sample Java code implementing softmax (you'll need import java.util.Arrays; for the stream):

private double softmax(double input, double[] neuronValues) {
    // Denominator of the formula: the sum of e^x over all values in the layer
    double total = Arrays.stream(neuronValues).map(Math::exp).sum();
    // e^input divided by that sum is this neuron's probability
    return Math.exp(input) / total;
}
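Note that Math.exp overflows for large inputs, so practical implementations usually subtract the maximum value before exponentiating; the result is unchanged because softmax is invariant to shifting all inputs by a constant. Here's a minimal sketch of a vectorized variant along those lines (my own addition, assuming you want the whole output distribution at once):

import java.util.Arrays;

private double[] softmax(double[] neuronValues) {
    // Shift by the max before exponentiating to avoid overflow;
    // this doesn't change the result since the shift cancels out.
    double max = Arrays.stream(neuronValues).max().orElse(0.0);
    double[] exps = Arrays.stream(neuronValues).map(v -> Math.exp(v - max)).toArray();
    double total = Arrays.stream(exps).sum();
    // Normalize so the outputs form a probability distribution summing to 1
    return Arrays.stream(exps).map(v -> v / total).toArray();
}

As for the derivative you asked about: the partial derivative of output S_i with respect to input Z_j is S_i * (1 - S_i) when i = j, and -S_i * S_j when i != j. If you pair softmax with a cross-entropy loss, the combined gradient with respect to the inputs simplifies to S - Y (predicted distribution minus the one-hot target), which is what most backpropagation implementations compute directly.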
Upvotes: 5