Reputation: 40909
I am familiar with machine learning, but I am learning TensorFlow on my own by reading university lecture slides. Below I set up the loss function for linear regression with a single feature. I add an L2 regularization loss to the total loss, but I am not sure whether I am doing it correctly:
import tensorflow as tf

# Assumed setup (not shown in the original snippet): placeholders for the
# single-feature inputs and targets.
number_of_examples = 100  # hypothetical training-set size
X = tf.placeholder(tf.float32, shape=(number_of_examples, 1))
y = tf.placeholder(tf.float32, shape=(number_of_examples, 1))

# Regularization
reg_strength = 0.01

# Create the loss function.
with tf.variable_scope("linear-regression"):
    W = tf.get_variable("W", shape=(1, 1), initializer=tf.contrib.layers.xavier_initializer())
    b = tf.get_variable("b", shape=(1,), initializer=tf.constant_initializer(0.0))
    yhat = tf.matmul(X, W) + b
    # Mean squared error over all training examples.
    error_loss = tf.reduce_sum(((y - yhat)**2) / number_of_examples)
    #reg_loss = reg_strength * tf.nn.l2_loss(W)  # reg 1
    reg_loss = reg_strength * tf.reduce_sum(W**2)  # reg 2
    loss = error_loss + reg_loss

# Set up the optimizer.
opt_operation = tf.train.GradientDescentOptimizer(0.001).minimize(loss)
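For context, a minimal way to run this graph would be the loop below (the arrays x_train and y_train are just stand-ins for the real data):

import numpy as np

# Hypothetical toy data: y = 2x + 1 plus noise, one feature.
x_train = np.random.rand(number_of_examples, 1).astype(np.float32)
y_train = (2.0 * x_train + 1.0 + 0.1 * np.random.randn(number_of_examples, 1)).astype(np.float32)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for step in range(1000):
        _, loss_val = sess.run([opt_operation, loss],
                               feed_dict={X: x_train, y: y_train})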
My specific questions are:

1. I have two lines (commented as reg 1 and reg 2) that compute the L2 loss of the weight W. The line marked reg 1 uses the TensorFlow built-in function. Are these two L2 implementations equivalent?
2. Am I adding the regularization loss reg_loss correctly to the final loss function?
Upvotes: 2
Views: 2840
Reputation: 12778
Are these two L2 implementations equivalent?

Almost. As @fabrizioM pointed out, see the introduction to l2_loss in the TensorFlow docs: tf.nn.l2_loss(t) computes sum(t ** 2) / 2, so the reg 1 line gives half the value of the reg 2 line.

Am I adding the regularization loss reg_loss correctly to the final loss function?

So far so good : )
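A quick sanity check of that factor of 2, sketched here on a toy constant tensor (illustrative, not from the original post):

import tensorflow as tf

w = tf.constant([[3.0], [4.0]])
builtin = tf.nn.l2_loss(w)      # 0.5 * (9 + 16) = 12.5
manual = tf.reduce_sum(w**2)    # 9 + 16 = 25.0

with tf.Session() as sess:
    print(sess.run([builtin, manual]))  # [12.5, 25.0]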
Upvotes: 1
Reputation: 48330
Almost. According to the L2Loss operation code:

output.device(d) = (input.square() * static_cast<T>(0.5)).sum();

it also multiplies by 0.5 (or, in other words, it divides by 2).
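So tf.nn.l2_loss(W) returns half of tf.reduce_sum(W**2). If you want the reg 1 line to match reg 2 exactly, one option (a sketch, not from the original answer) is to fold the factor of 2 back in:

# tf.nn.l2_loss(W) == tf.reduce_sum(W**2) / 2, so doubling it
# recovers the plain sum of squares used in reg 2.
reg_loss = reg_strength * 2.0 * tf.nn.l2_loss(W)

In practice the constant factor is usually just absorbed into reg_strength, since it only rescales a hyperparameter you tune anyway.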
Upvotes: 2