Reputation: 602
Consider the following model, built using the tf.keras API, where I used kernel_regularizer=tf.keras.regularizers.l2(l2) on the penultimate layer, just before the sigmoid layer, for binary classification:
import tensorflow as tf

# l2 is a float hyperparameter holding the regularization strength
model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(input_shape=(224, 224, 3), filters=32, kernel_size=(3, 3), strides=(1, 1), padding='same', activation='relu'),
    tf.keras.layers.MaxPooling2D(pool_size=(2, 2)),
    tf.keras.layers.Conv2D(filters=64, kernel_size=(3, 3), strides=(1, 1), padding='same', activation='relu'),
    tf.keras.layers.MaxPooling2D(pool_size=(2, 2)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(units=512, activation='relu', kernel_regularizer=tf.keras.regularizers.l2(l2)),
    tf.keras.layers.Dense(units=1, activation='sigmoid'),
])
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
What exactly does the kernel_regularizer parameter in a tf.keras.layers.Layer implement in terms of the loss function being optimized? Is it just adding the regularization penalty, i.e. l2 * sum(w**2), to the loss function as is traditionally taught? And is it doing that with respect to all of the network's weights, or just that layer's?
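For reference, this is the traditional formulation I have in mind (a rough sketch in plain Python, not Keras internals; lam is the regularization strength and weights are whichever parameters get penalized):

import numpy as np

def penalized_loss(data_loss, weights, lam):
    # L2-penalized objective as traditionally taught:
    # data loss plus lam times the sum of squared weights
    return data_loss + lam * sum(np.sum(w ** 2) for w in weights)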
Upvotes: 0
Views: 5300
Reputation: 56377
Yes, it just adds the regularization penalty to the loss, but only with respect to that layer's weights; you can see that here. This lets you control which layers are regularized, and you can even use a different regularization strength for each layer.
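You can verify this by inspecting model.losses, which collects the per-layer penalty terms that Keras adds to the data loss during training. A minimal sketch (the layer sizes and the 0.01 strength here are arbitrary choices, not taken from your model):

import tensorflow as tf

model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(4, activation='relu', input_shape=(8,),
                          kernel_regularizer=tf.keras.regularizers.l2(0.01)),
    tf.keras.layers.Dense(1, activation='sigmoid'),  # no regularizer here
])

# One penalty term per regularized layer; the second Dense contributes nothing.
print(len(model.losses))  # 1

# The term is exactly 0.01 * sum(kernel ** 2), computed on that layer's kernel only.
kernel = model.layers[0].kernel
manual = 0.01 * tf.reduce_sum(tf.square(kernel))
print(float(model.losses[0]), float(manual))  # same value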
Upvotes: 4