fabiomaia

Reputation: 602

What does the kernel_regularizer parameter in a tf.keras.layers.Layer actually implement in terms of the loss function being optimized?

Consider the following model, built with the tf.keras API, where I use kernel_regularizer=tf.keras.regularizers.l2(l2) on the penultimate layer, just before the sigmoid output layer, for binary classification.

import tensorflow as tf

l2 = 0.01  # L2 regularization factor (hyperparameter; 0.01 is just a placeholder value)

model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(input_shape=(224, 224, 3), filters=32, kernel_size=(3, 3), strides=(1, 1), padding='same', activation='relu'),
    tf.keras.layers.MaxPooling2D(pool_size=(2, 2)),

    tf.keras.layers.Conv2D(filters=64, kernel_size=(3, 3), strides=(1, 1), padding='same', activation='relu'),
    tf.keras.layers.MaxPooling2D(pool_size=(2, 2)),

    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(units=512, activation='relu', kernel_regularizer=tf.keras.regularizers.l2(l2)),
    tf.keras.layers.Dense(units=1, activation='sigmoid'),
])
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

What exactly does the kernel_regularizer parameter in a tf.keras.layers.Layer implement in terms of the loss function being optimized? Is it just adding the regularization penalty, i.e.

λ Σᵢ wᵢ²

to the loss function, as traditionally taught? And is it doing that with respect to all of the network's weights, or just that layer's?
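For concreteness, here is the "traditionally taught" version written out by hand (the regularized_loss helper and the 0.01 default are just illustrative, not necessarily what Keras does internally):

import tensorflow as tf

def regularized_loss(y_true, y_pred, kernels, l2=0.01):
    # data term: plain binary cross-entropy
    data_loss = tf.reduce_mean(tf.keras.losses.binary_crossentropy(y_true, y_pred))
    # penalty term: l2 * sum of squared weights, over whichever kernels are included
    penalty = l2 * tf.add_n([tf.reduce_sum(tf.square(w)) for w in kernels])
    return data_loss + penalty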

Upvotes: 0

Views: 5300

Answers (1)

Dr. Snoopy

Reputation: 56377

Yes, it just adds the regularization penalty to the loss, computed with respect to that layer's weights only; you can see that here. This lets you control which layers are regularized, and you can even use a different regularization strength for each layer.
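A quick way to verify this is to inspect model.losses on a toy model where only one layer has a kernel_regularizer (the layer sizes and the 0.01 factor below are arbitrary, just a sketch):

import tensorflow as tf

# Toy model: only the first Dense layer has a kernel_regularizer.
reg_layer = tf.keras.layers.Dense(4, kernel_regularizer=tf.keras.regularizers.l2(0.01))
plain_layer = tf.keras.layers.Dense(1, activation='sigmoid')
model = tf.keras.Sequential([tf.keras.Input(shape=(8,)), reg_layer, plain_layer])
model(tf.zeros((1, 8)))   # forward pass to make sure everything is built

print(len(model.losses))  # 1 -> exactly one penalty term, contributed by reg_layer
print(model.losses[0])    # 0.01 * sum of squares of reg_layer's kernel

During training, Keras adds every entry of model.losses to the compiled loss, so the optimized objective here is the binary cross-entropy plus that single penalty term.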

Upvotes: 4
