Reputation: 135
I found in other questions that the standard way to do L2 regularization in convolutional networks with TensorFlow is as follows.
For each conv2d layer, set the parameter kernel_regularizer to an l2_regularizer, like this:
regularizer = tf.contrib.layers.l2_regularizer(scale=0.1)
layer2 = tf.layers.conv2d(
    inputs,
    filters,
    kernel_size,
    kernel_regularizer=regularizer)
Then, in the loss function, collect the regularization losses:
reg_losses = tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES)
reg_constant = 0.01 # Choose an appropriate one.
loss = my_normal_loss + reg_constant * sum(reg_losses)
Many people, including me, made the mistake of skipping the 2nd step, which implies that the meaning of kernel_regularizer is not well understood. I have an assumption that I can't confirm:
By setting kernel_regularizer for a single layer, you are telling the network to forward that layer's kernel weights to the loss function at the end of the network, so that later you have the option (via another piece of code you write) to include them in the final regularization term of the loss function. Nothing more.
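One way to check this would be to inspect the collection right after building the layer. A minimal TF 1.x sketch (the input shape and layer parameters below are made up for illustration):
import tensorflow as tf  # TF 1.x

inputs = tf.placeholder(tf.float32, [None, 32, 32, 3])
regularizer = tf.contrib.layers.l2_regularizer(scale=0.1)
layer2 = tf.layers.conv2d(inputs, filters=16, kernel_size=3,
                          kernel_regularizer=regularizer)

# One extra node per regularized kernel; the loss itself is untouched so far.
print(tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES))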
Is this correct, or is there a better explanation?
Upvotes: 2
Views: 4067
Reputation: 116
For TensorFlow 2.x, the regularization losses created by layer regularizers are collected in model.losses, as shown in the TF1.x -> TF2.x migration guide. Sum them into your total loss with tf.math.add_n(model.losses), as shown here.
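For example, a minimal Keras sketch (the layer sizes below are made up for illustration):
import tensorflow as tf  # TF 2.x

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(
        16, 3, input_shape=(32, 32, 3),
        kernel_regularizer=tf.keras.regularizers.l2(0.1)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10),
])

# Each kernel_regularizer contributes one tensor to model.losses.
print(model.losses)

# In a custom training loop you would add them to the data loss:
# total_loss = data_loss + tf.math.add_n(model.losses)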
Upvotes: 0
Reputation: 27042
Setting a regularizer on a tf.layers.* layer means: take the layer's weights, create a node in the computational graph that computes the regularization on that specific set of weights (and nothing more), and add this node to the tf.GraphKeys.REGULARIZATION_LOSSES collection.
After that, it is your job to get the elements of this collection and add them to your loss. To do that, you can use tf.losses.get_regularization_losses and sum all the returned terms.
There is an error in your code: you shouldn't multiply by an additional constant (reg_constant * sum(reg_losses)), because the scale factor is already applied to each term when you specify the regularizer for the layer (scale=0.1 in your snippet).
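A minimal TF 1.x sketch of the corrected pattern (the input shape and layer parameters below are made up for illustration):
import tensorflow as tf  # TF 1.x

inputs = tf.placeholder(tf.float32, [None, 32, 32, 3])
regularizer = tf.contrib.layers.l2_regularizer(scale=0.1)
layer2 = tf.layers.conv2d(inputs, filters=16, kernel_size=3,
                          kernel_regularizer=regularizer)

# scale=0.1 is already baked into each collected term,
# so a plain sum is enough; no extra reg_constant is needed.
reg_term = tf.add_n(tf.losses.get_regularization_losses())
# loss = my_normal_loss + reg_term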
Upvotes: 6