Turkdogan Tasdelen

Reputation: 898

Error calculation in backpropagation (gradient descent)

Can someone please give an explanation about the calculation of the error in backpropagation which is found in many code examples such as:

error = calculated - target
// then compute the gradient of this error with respect to each parameter...

Is this the same for squared error and cross-entropy error? How?

Thanks...

Upvotes: 0

Views: 523

Answers (1)

Ash

Reputation: 4728

I will denote by x an example from the training set, by f(x) the prediction of your network for this particular example, and by g_x the ground truth (label) associated with x.

The short answer is: the root mean squared (RMS) error is used when you have a network that can exactly, and differentiably, predict the labels that you want. The cross-entropy error is used when your network predicts scores for a set of discrete labels.

To clarify, you usually use the root mean squared (RMS) error when you want to predict values that can change continuously. Imagine you want your network to predict vectors in R^n. This is the case when, for example, you want to predict surface normals or optical flow. These values change continuously, so ||f(x)-g_x|| is differentiable; you can use backprop and train your network.
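For instance, here is a minimal NumPy sketch of the squared-error case (the numbers are made up purely for illustration):

import numpy as np

# Squared-error loss for a regression target in R^n (illustrative values)
f_x = np.array([0.8, 1.9, 3.2])   # network prediction f(x)
g_x = np.array([1.0, 2.0, 3.0])   # ground truth g_x

# Loss: 0.5 * ||f(x) - g_x||^2  (the factor 0.5 just simplifies the gradient)
loss = 0.5 * np.sum((f_x - g_x) ** 2)

# Gradient of the loss with respect to the prediction: f(x) - g_x,
# which is exactly the "error = calculated - target" from the question
grad = f_x - g_x

print(loss, grad)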

Cross-entropy, on the other hand, is useful in classification with m labels, for example in image classification. In that case, g_x takes the discrete values c_1, c_2, ..., c_m, where m is the number of classes. Now you cannot use RMS, because if you assume that your network predicts the exact labels (i.e. f(x) in {c_1, ..., c_m}), then ||f(x)-g_x|| is no longer differentiable and you cannot use back-propagation. So you build a network that does not compute class labels directly, but instead computes a set of scores s_1, ..., s_m, one for each class label. Then you maximize the probability of the correct class by applying a softmax function to the scores (equivalently, you minimize the cross-entropy between the resulting distribution and the true label). This makes the loss function differentiable.
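Here is a matching minimal NumPy sketch of the softmax + cross-entropy case (again with made-up numbers). Note that the gradient of the loss with respect to the scores works out to probs - one_hot(target), i.e. once more a "calculated - target" difference, which is why the same error = calculated - target line shows up in code for both losses:

import numpy as np

# Softmax + cross-entropy for m discrete classes (illustrative values)
scores = np.array([2.0, 1.0, 0.1])   # raw scores s_1, ..., s_m from the network
target = 1                           # index of the correct class

# Softmax turns the scores into probabilities (shift by max for numerical stability)
probs = np.exp(scores - np.max(scores))
probs /= probs.sum()

# Cross-entropy loss: minus the log-probability assigned to the correct class
loss = -np.log(probs[target])

# Gradient of the loss with respect to the scores: probs - one_hot(target),
# again a "calculated - target" difference, now on class probabilities
one_hot = np.zeros_like(probs)
one_hot[target] = 1.0
grad = probs - one_hot

print(loss, grad)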

Upvotes: 1
