qkasriel

Reputation: 1

Model is unable to overfit, even with a tiny dataset

I'm trying to create a CNN that can predict the positions of up to 5 objects. Each point has 4 values: x and y coordinates, a presence value (0 if it's a padding value, 1 if the object exists) and a depth value (0 for close up/short depth, 1 for far away/long depth). No matter what I do, the model is unable to learn/overfit. I just ran it for 600 epochs on a dataset of 100 images with no augmentation, dropout, regularization or anything else that could reduce overfitting, and it was still unable to place points where it was supposed to.
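
For concreteness, one target vector is built roughly like this (a minimal sketch; the make_target helper and the per-slot ordering are just illustrative, padded slots are all zeros):

import numpy as np

# one target = 5 slots x 4 values, flattened to 20
# assumed per-slot ordering (illustrative): (x, y, depth, presence)
def make_target(points):
    target = np.zeros((5, 4), dtype=np.float32)
    for i, (x, y, depth) in enumerate(points[:5]):
        target[i] = [x, y, depth, 1.0]  # presence = 1 for a real object
    return target.flatten()             # unused slots stay all-zero (presence = 0)

# e.g. two objects present, three padding slots
y_example = make_target([(12.0, 80.5, 0.0), (150.2, 33.1, 1.0)])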

This is my model. I'm using ResNet50 as a base.

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.applications import ResNet50
from tensorflow.keras.regularizers import l2


class my_model(keras.Model):
    def __init__(self):
        super().__init__()

        # ResNet50 backbone with ImageNet weights, left fully trainable
        self.base_model = ResNet50(include_top=False, input_shape=(224, 224, 3), weights='imagenet')
        self.base_model.trainable = True

        self.global_avg_pool = keras.layers.GlobalAveragePooling2D()

        # l2(0) and Dropout(0) are deliberate no-ops: regularization is
        # disabled while I try to get the model to overfit
        self.dense1 = keras.layers.Dense(128, kernel_initializer=keras.initializers.HeNormal(), kernel_regularizer=l2(0))
        self.batch_norm1 = keras.layers.BatchNormalization()
        self.activation1 = keras.layers.LeakyReLU(alpha=.1)

        self.dropout1 = keras.layers.Dropout(0)
        self.dense2 = keras.layers.Dense(64, kernel_initializer=keras.initializers.HeNormal(), kernel_regularizer=l2(0))

        self.batch_norm2 = keras.layers.BatchNormalization()
        self.dropout2 = keras.layers.Dropout(0)

        # 20 raw outputs = 5 points x (x, y, depth, presence)
        self.output_layer = keras.layers.Dense(20, activation=None, kernel_regularizer=l2(0))

    @tf.function(reduce_retracing=True)
    def call(self, inputs, training=False):
        x = self.base_model(inputs, training=training)
        x = self.global_avg_pool(x)

        x = self.dense1(x)
        #x = self.batch_norm1(x, training=training)

        x = self.activation1(x)
        x = self.dropout1(x, training=training)

        x = self.dense2(x)

        #x = self.batch_norm2(x, training=training)
        x = self.activation1(x)
        x = self.dropout2(x, training=training)

        output = self.output_layer(x)

        # split the 20 raw outputs into per-point heads:
        # indices 0,4,8,... -> x, 1,5,... -> y, 2,6,... -> depth, 3,7,... -> presence
        x_vals = keras.activations.relu(output[:, ::4])
        y_vals = keras.activations.relu(output[:, 1::4])
        depth_vals = keras.activations.sigmoid(output[:, 2::4])
        presence_vals = keras.activations.sigmoid(output[:, 3::4])

        outputs = keras.layers.Concatenate(axis=-1)([x_vals, y_vals, depth_vals, presence_vals])

        return outputs

And then I compile it:

model = my_model()
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.1, clipvalue=1.0),
    loss='mean_squared_error',
    metrics=["accuracy"]
)
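
For reference, the training run itself is nothing special (a rough sketch; x_train/y_train stand in for my 100-image dataset and the batch size is arbitrary):

# 100 images, 600 epochs, no validation split or augmentation --
# the only goal is to see whether the model can memorize the data
history = model.fit(x_train, y_train, epochs=600, batch_size=16, shuffle=True)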

My guess right now is that it's just stuck in some local minimum. I've tried a few different optimizers and they all achieve more or less the same result. Then I thought that maybe the learning rate was the issue, so I tried a bunch of values from 0.0001 to 0.1 (past 0.1, values just explode to NaN), and I tried adding cosine annealing so my LR varies, roughly as sketched below.
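
The cosine-annealing attempt looked something like this (a sketch; the initial LR and decay_steps are just example values from the ranges I tried):

lr_schedule = tf.keras.optimizers.schedules.CosineDecay(
    initial_learning_rate=1e-3,     # I swept starting values from 0.0001 to 0.1
    decay_steps=600 * (100 // 16),  # roughly epochs * steps_per_epoch
)
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=lr_schedule, clipvalue=1.0),
    loss='mean_squared_error',
    metrics=["accuracy"]
)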

None of these helped, and it's still stuck (when I say "stuck", I mean that it predicts the exact same thing every time).

Upvotes: 0

Views: 37

Answers (0)
