Reputation: 41
I am trying to understand the following backpropagation code in Python:
probs = exp_scores / np.sum(exp_scores, axis=1, keepdims=True)
# Backpropagation
delta3 = probs
delta3[range(num_examples), y] -= 1
dW2 = (a1.T).dot(delta3)
....
but I cannot understand the following line of the code:
delta3[range(num_examples), y] -= 1
Could you please tell me what this does?
Thank you very much for your help!
Upvotes: 0
Views: 289
Reputation: 53758
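If you're interested in why it's computed this way, it's backpropagation through the cross-entropy loss:
probs holds the class probabilities for each example (computed in the forward pass via softmax).
delta3 is the error signal from the loss function.
y holds the ground-truth classes for the mini-batch.
For softmax followed by cross-entropy, the gradient of the loss with respect to the scores is probs minus the one-hot encoding of y, which is why 1 is subtracted at the true-class entries. Everything else is just math, which is well explained in this post, and they end up with the same numpy expression.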
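To make the claim concrete, here is a minimal sketch that compares the "probs minus 1 at the true class" expression against a finite-difference gradient. The helper names (softmax, cross_entropy), the random scores and the labels are made up for illustration, and the loss is assumed to be the summed (not averaged) cross-entropy; the original code may scale it differently.

```python
import numpy as np

def softmax(scores):
    # Subtract the row-wise max for numerical stability; the result is unchanged.
    shifted = scores - scores.max(axis=1, keepdims=True)
    exp_scores = np.exp(shifted)
    return exp_scores / exp_scores.sum(axis=1, keepdims=True)

def cross_entropy(scores, y):
    # Summed negative log-probability of the true class (assumption: no averaging).
    probs = softmax(scores)
    n = scores.shape[0]
    return -np.log(probs[np.arange(n), y]).sum()

# Hypothetical toy data: 5 examples, 3 classes.
rng = np.random.default_rng(0)
scores = rng.normal(size=(5, 3))
y = np.array([0, 2, 1, 1, 0])

# Analytic gradient: the expression from the question.
analytic = softmax(scores)
analytic[np.arange(scores.shape[0]), y] -= 1

# Numerical gradient via central differences, one score at a time.
eps = 1e-6
numeric = np.zeros_like(scores)
for i in range(scores.shape[0]):
    for j in range(scores.shape[1]):
        up = scores.copy()
        down = scores.copy()
        up[i, j] += eps
        down[i, j] -= eps
        numeric[i, j] = (cross_entropy(up, y) - cross_entropy(down, y)) / (2 * eps)

max_abs_diff = np.abs(analytic - numeric).max()
print(max_abs_diff)  # should be tiny if the two gradients agree
```

If the two gradients agree up to floating-point noise, the in-place subtraction is indeed the derivative of the loss with respect to the scores.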
Upvotes: 0
Reputation: 6528
There are two things here. First, it uses numpy integer-array indexing to select a subset of the elements of delta3. Second, it subtracts 1 from each selected element.
More precisely, delta3[range(num_examples), y] pairs the row indices 0 to num_examples - 1 with the column indices stored in y, so for each row i it selects the single element in column y[i].
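A small sketch of what that line does, using a made-up 3x4 probs matrix and made-up labels (none of these values come from the question). It also copies probs first, because delta3 = probs by itself only creates a second name for the same array, so the in-place subtraction would change probs as well.

```python
import numpy as np

# Hypothetical toy batch: 3 examples, 4 classes.
probs = np.array([
    [0.10, 0.20, 0.30, 0.40],
    [0.25, 0.25, 0.25, 0.25],
    [0.70, 0.10, 0.10, 0.10],
])
y = np.array([3, 0, 1])        # true class index for each example
num_examples = probs.shape[0]

delta3 = probs.copy()          # copy so that probs itself stays untouched
delta3[range(num_examples), y] -= 1   # touches (0, 3), (1, 0) and (2, 1) only

# The same operation written as an explicit loop.
expected = probs.copy()
for i in range(num_examples):
    expected[i, y[i]] -= 1

print(np.allclose(delta3, expected))
```

Only one element per row changes: the one in the column given by that row's label.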
Upvotes: 1