Lukas

Reputation: 93

The derivative of Softmax outputs really large shapes

I am building my first neural network, a basic handwritten digit recognizer, without any framework (such as TensorFlow or PyTorch), using the backpropagation algorithm.

My NN has 784 inputs and 10 outputs, so I use softmax for the last layer.

Because of some memory errors, my images currently have shape (300, 784) and my labels have shape (300, 10). After that, I calculate the loss with categorical cross-entropy. Now to my problem: in backpropagation, I need to manually compute the first derivative of an activation function. I am doing it like this:

dAl = -(np.divide(Y, Al) - np.divide(1 - Y, 1 - Al))
# Y: test labels
# Al: activation value from my last layer

After that, backpropagation starts at the last layer, which is softmax:

def SoftmaxDerivative(dA, Z):
    # Z is the output of np.dot(A_prev, W) + b,
    # where A_prev is the activation value from the previous layer,
    # W is the weight and b is the bias.
    # dA is the derivative of the activation function value.
    x = activation_functions.softmax(dA)
    s = x.reshape(-1,1)
    dZ = np.diagflat(s) - np.dot(s, s.T)
    return dZ
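
The output shape of this function can be checked in isolation. The sketch below is only a reproduction under assumptions: `softmax` is a hypothetical row-wise stand-in for `activation_functions.softmax`, which is not shown, and the batch has 4 rows rather than 300.

```python
import numpy as np

def softmax(z):
    # Hypothetical stand-in: row-wise softmax, shifted for numerical stability.
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def softmax_derivative(dA, Z):
    # Same steps as SoftmaxDerivative above, unchanged.
    x = softmax(dA)
    s = x.reshape(-1, 1)  # flattens the whole batch into one column
    return np.diagflat(s) - np.dot(s, s.T)

batch, classes = 4, 10
dA = np.random.randn(batch, classes)
Z = np.random.randn(batch, classes)
dZ = softmax_derivative(dA, Z)
print(dZ.shape)  # (40, 40), i.e. (batch * classes) on each side
```

With 300 rows the same reshape produces a column of 3000 entries, so the result is (3000, 3000).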

1. Is this function working properly?

In the end, I would like to compute the derivatives of the weights and biases, so I am using this:

dW = (1/m)*np.dot(dZ, A_prev.T)
#m is A_prev.shape[1] -> 10
db = (1/m)*np.sum(dZ, axis = 1, keepdims = True)

But it fails on dW, because dZ.shape is (3000, 3000) (compared to A_prev.shape, which is (300, 10)). From this I assume there are only three possible explanations:

  1. My Softmax backward is wrong

  2. dW is wrong

  3. I have a bug somewhere else entirely

Any help would be really appreciated!

Upvotes: 0

Views: 1115

Answers (1)

Denzel

Reputation: 449

I faced the same problem recently. I'm not sure, but maybe this question will help you: Softmax derivative in NumPy approaches 0 (implementation)
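
For reference, when softmax is paired with categorical cross-entropy and one-hot labels, the gradient with respect to Z simplifies to A - Y, so the full Jacobian never has to be built. The following is a minimal sketch of that shortcut, not the asker's code: it assumes samples are stored as rows (matching np.dot(A_prev, W) + b), and the names `softmax`, `cross_entropy`, `output_layer_grads`, and the layer width `n_prev` are hypothetical.

```python
import numpy as np

def softmax(z):
    # Row-wise softmax, shifted for numerical stability.
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(A, Y):
    # Mean categorical cross-entropy over the batch (Y is one-hot).
    return -np.sum(Y * np.log(A)) / A.shape[0]

def output_layer_grads(A_prev, W, b, Y):
    # Assumed layout: A_prev (m, n_prev), W (n_prev, n_out), b (1, n_out).
    m = A_prev.shape[0]               # number of samples in the batch
    Z = np.dot(A_prev, W) + b
    A = softmax(Z)
    dZ = A - Y                        # combined softmax + cross-entropy gradient
    dW = np.dot(A_prev.T, dZ) / m     # same shape as W
    db = np.sum(dZ, axis=0, keepdims=True) / m
    return dZ, dW, db

rng = np.random.default_rng(0)
m, n_prev, n_out = 300, 16, 10        # n_prev = 16 is an arbitrary choice
A_prev = rng.standard_normal((m, n_prev))
W = rng.standard_normal((n_prev, n_out)) * 0.01
b = np.zeros((1, n_out))
Y = np.eye(n_out)[rng.integers(0, n_out, size=m)]  # random one-hot labels

dZ, dW, db = output_layer_grads(A_prev, W, b, Y)
print(dZ.shape, dW.shape, db.shape)   # (300, 10) (16, 10) (1, 10)
```

In this layout m is the number of rows (samples), and dZ keeps the same shape as the labels, so the product for dW lines up with W.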

Upvotes: 1
