alyssaeliyah

Reputation: 2244

Keras Always Outputs a Constant Value

I've been training a simple artificial neural network using Keras:

from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import Adam

model = Sequential([
    Dense(32, input_shape=(32,), activation='relu'),
    Dense(20, activation='relu'),
    Dense(65, input_shape=(65,), activation='softmax')
])

model.summary()
model.compile(Adam(lr=.001), loss='binary_crossentropy', metrics=['accuracy'])
model.fit(train_samples, train_labels, batch_size=1000, epochs=10000,shuffle = True, verbose=2)

After training, I test the model, and it always outputs a constant value. What shall I do?

https://github.com/keras-team/keras/issues/1727

The issue linked above says that I have to center my data to have zero mean, but I have no idea how to do that.

Upvotes: 1

Views: 963

Answers (2)

Daniel Möller

Reputation: 86600

You're using 'relu' activations without guarding against their outputs going to all zeros.

Relu has a "zero region" when its input is negative, and this region naturally has no gradient, so there is no possibility of it changing during training.

If all neurons in a layer go to the zero region, your model is frozen forever.
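
As a quick illustration (a minimal NumPy sketch, not from the original answer), the derivative of relu is zero for every negative input, so a neuron stuck in that region receives no weight updates:

import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def relu_grad(x):
    # derivative of relu: 1 where the input is positive, 0 everywhere else
    return (x > 0).astype(float)

x = np.array([-3.0, -0.5, 0.0, 2.0])
print(relu(x))       # [0. 0. 0. 2.]
print(relu_grad(x))  # [0. 0. 0. 1.] -> no gradient flows for the negative inputs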

One thing you can do is replace 'relu' with 'sigmoid' or 'tanh'.
Another is to add a BatchNormalization layer before each 'relu'.

BatchNormalization does the centering for you, and it also speeds up training and adds a little regularization.

from keras.models import Sequential
from keras.layers import Dense, BatchNormalization, Activation

model = Sequential([

    # optional: BatchNormalization(input_shape=(32,))

    Dense(32, input_shape=(32,)),
    BatchNormalization(),
    Activation('relu'),
    Dense(20),
    BatchNormalization(),
    Activation('relu'),
    Dense(65, input_shape=(65,), activation='softmax')
])

Upvotes: 2

Mitiku

Reputation: 5412

If you want to center your data to have zero mean, just subtract the sample mean from each sample.

E.g. if you have a features array and want to center it to have zero mean, you can do that using NumPy's mean function as follows.

import numpy as np

features = features - np.mean(features)

You may also need to normalize by the standard deviation. This can be done with NumPy as follows.

normalized_features = (features - np.mean(features))/ np.std(features)
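
If features is a 2D array of shape (samples, features), you will usually want to center and scale each feature column separately rather than using one global mean. A minimal sketch, using made-up example data:

import numpy as np

# hypothetical example data: 1000 samples, 32 features
features = np.random.rand(1000, 32) * 50.0

# center and scale each feature column independently
normalized_features = (features - features.mean(axis=0)) / features.std(axis=0)

print(normalized_features.mean(axis=0).round(6))  # ~0 for every column
print(normalized_features.std(axis=0).round(6))   # ~1 for every column

Remember to compute the mean and standard deviation on the training data only and reuse those same values when preprocessing your test data.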

I hope this helps.

Upvotes: 2
