Reputation: 131
I have been looking at some convolutional neural network training code, and I do not understand the following part:
loss = tf.reduce_sum(tf.nn.l2_loss(tf.subtract(train_output, train_gt)))
for w in weights:
    loss += tf.nn.l2_loss(w)*1e-4
The first line is understandable: it compares the learned result with the label and takes the square of the difference, and that is the definition of the loss. But I do not understand the later code starting with for w in weights:. Here weights is a list of 10 weights and 10 biases, so its length is 20 (10 w + 10 b). Why does this code calculate the square of each w, multiply it by 1e-4, and add it to the loss? Is it necessary for learning?
Upvotes: 0
Views: 314
Reputation: 8585
This is the formula that you have:
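Using the fact that tf.nn.l2_loss(t) returns sum(t ** 2) / 2, the loss being built is roughly

$$\text{loss} = \frac{1}{2}\sum_{i=1}^{N}\left(\text{train\_output}_i - \text{train\_gt}_i\right)^2 + 10^{-4}\cdot\frac{1}{2}\sum_{w \in \text{weights}}\lVert w\rVert_2^2,$$

where the first term is the data loss over the N samples in the batch and the second is the L2 penalty on every weight tensor.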
1. tf.subtract(train_output, train_gt) does element-wise subtraction between the two tensors train_output and train_gt.
2. tf.nn.l2_loss(tf.subtract(train_output, train_gt)) computes the L2 loss of the tensor from (1), i.e. half the sum of its squared elements.
3. tf.reduce_sum(tf.nn.l2_loss(tf.subtract(train_output, train_gt))) performs a reduction sum over the result (e.g. over the multiple samples in your batch, the N samples in the formula).
4. for w in weights: loss += tf.nn.l2_loss(w)*1e-4 adds an L2-regularization term: the (halved) sum of the squared weights of your model, scaled by 1e-4.
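Putting steps 1-4 together, a minimal runnable sketch of the same computation (TF 2.x eager style; the tensors, shapes, and values below are made up purely for illustration):

import tensorflow as tf

# Toy stand-ins for the real tensors.
train_output = tf.constant([[1.0, 2.0], [3.0, 4.0]])        # network predictions
train_gt = tf.constant([[1.5, 2.0], [2.0, 4.5]])            # ground-truth labels
weights = [tf.constant([0.1, -0.2]), tf.constant([[0.3]])]  # model parameters

# Steps 1-3: data term. tf.nn.l2_loss(t) returns sum(t ** 2) / 2,
# so this is half the sum of squared differences over the batch.
diff = tf.subtract(train_output, train_gt)
loss = tf.reduce_sum(tf.nn.l2_loss(diff))

# Step 4: L2 regularization, half the sum of squared weights, scaled by 1e-4.
for w in weights:
    loss += tf.nn.l2_loss(w) * 1e-4

print(loss.numpy())  # 0.75 from the data term plus a tiny penalty (~7e-6) from the weights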
"But why does this code calculate the square of w and multiply it by 1e-4 to add to the loss? Is it necessary for learning?"

It penalizes large values of your weights and limits your solution (in terms of the weights) to some bounded region. Is it necessary? Sometimes yes, and sometimes no; there is no short answer. Start with reading this:
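In the meantime, a quick intuition (a rough sketch of the math): the gradient of the penalty term with respect to one weight tensor is

$$\frac{\partial}{\partial w}\left(10^{-4}\cdot\tfrac{1}{2}\lVert w\rVert_2^2\right) = 10^{-4}\,w,$$

so a plain gradient-descent step with learning rate $\eta$ becomes $w \leftarrow w - \eta\left(\nabla_w(\text{data loss}) + 10^{-4}\,w\right)$, i.e. every update also shrinks each weight by a factor of $(1 - 10^{-4}\eta)$. That is why L2 regularization is also called weight decay, and why it keeps the weights in a bounded region.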
Upvotes: 2