bremen_matt

Reputation: 7339

Keras Dropout layer does not appear to work

I have a relatively straightforward Keras model that I have seen many other people use in the literature. In its simplified form, it looks like this:

from keras.models import Sequential
from keras.layers import Dense, Dropout

model = Sequential()
model.add(Dense(n, activation="relu"))
model.add(Dropout(dropout))
model.add(Dense(m, activation="relu"))
model.add(Dropout(dropout))
model.add(Dense(p))

where n, m, and p are arbitrary layer widths, and dropout is the dropout rate. I train the model like this:

import keras

model.compile(loss='mae', optimizer='adam')

lossHistory = keras.callbacks.History()
model.fit_generator(train_generator,
                    steps_per_epoch=steps_per_epoch,
                    epochs=epochs,
                    validation_data=valid_generator,
                    validation_steps=validation_steps,
                    callbacks=[lossHistory])

Nothing crazy here. The problem is that the dropout parameter seems to have no effect: I get heavy overfitting regardless of the dropout value I use (I have tried 0.1, 0.2, ..., 0.95).

Therefore, to try to diagnose the problem, I tried the extreme values (dropout = 0 and 1). I might be misunderstanding what the dropout number represents, but one of these two values should drop everything, making the model essentially untrainable (it should then produce a constant output). However, with a dropout value of 0 the training looks like:

1/20 [>.............................] - ETA: 139s - loss: 0.9623
2/20 [==>...........................] - ETA: 87s - loss: 0.7758 
3/20 [===>..........................] - ETA: 68s - loss: 0.6146

then with a dropout value of 1, the training looks like:

 1/20 [>.............................] - ETA: 178s - loss: 0.2134
 2/20 [==>...........................] - ETA: 102s - loss: 0.2295
 3/20 [===>..........................] - ETA: 76s - loss: 0.2368 

This should be impossible. What am I missing here, guys? Dropout has been very useful for me in my TensorFlow models, but something appears to be wrong with how I am using it in Keras...
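To spell out what I think the dropout number represents, here is a plain-NumPy sketch of inverted dropout, which is what Keras applies in the train phase (note the convention: Keras's rate is the fraction of units dropped, the inverse of TF 1.x's tf.nn.dropout keep_prob, which is the fraction kept):

import numpy as np

def dropout_train(x, rate, rng=np.random.default_rng(0)):
    # Inverted dropout: keep each unit with probability 1 - rate,
    # then scale the survivors by 1/(1 - rate) so the expected activation is unchanged
    keep_prob = 1.0 - rate
    mask = rng.random(x.shape) < keep_prob
    return np.where(mask, x / keep_prob, 0.0)

print(dropout_train(np.ones(10), 0.2))  # roughly 2 of 10 units zeroed; survivors become 1.25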

Just for the record, a snippet of the model.summary() output shows:

dense_1 (Dense)              (None, 50)                550       
_________________________________________________________________
dropout_1 (Dropout)          (None, 50)                0         
_________________________________________________________________
dense_2 (Dense)              (None, 1)                 51        

So it does seem that the dropout layer is actually in the model (not some dumb bug where I accidentally excluded it).

Upvotes: 0

Views: 1486

Answers (1)

Julio Daniel Reyes

Reputation: 6365

Setting the dropout rate to 0 or 1 makes Keras skip dropout entirely: per the layer's definition in the source code, dropout is only applied when the rate is strictly between 0 and 1.

def call(self, inputs, training=None):
    # Dropout is only applied when the rate is strictly between 0 and 1
    if 0. < self.rate < 1.:
        noise_shape = self._get_noise_shape(inputs)

        def dropped_inputs():
            return K.dropout(inputs, self.rate, noise_shape,
                             seed=self.seed)
        return K.in_train_phase(dropped_inputs, inputs,
                                training=training)
    # A rate of exactly 0 or 1 falls through: the inputs pass unchanged
    return inputs
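A quick way to see that guard in action (a minimal sketch assuming TF 2.x / tf.keras; the pattern of zeros will differ from run to run):

import numpy as np
import tensorflow as tf

x = np.ones((1, 8), dtype="float32")

# rate = 0 falls through the guard: inputs come back unchanged, even in the train phase
print(tf.keras.layers.Dropout(0.0)(x, training=True).numpy())

# an in-range rate actually drops: roughly half the units are zeroed,
# and the survivors are scaled by 1/(1 - 0.5) = 2.0
print(tf.keras.layers.Dropout(0.5)(x, training=True).numpy())

With a rate of 1, the call above takes the same fall-through branch as a rate of 0, which is why the model keeps training normally instead of collapsing to a constant output.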

Upvotes: 2
