Reputation: 882
I have a binary classification model that reaches almost 100% accuracy after 7-14 epochs. However, after hitting a minimum loss of 0.0004, the loss jumps in the next epoch to as much as 7.5 (which means the model classifies correctly only about 50% of the time, the same as pure chance) and stays near 7 for all subsequent epochs.
I use the Adam optimiser, which should take care of adapting the learning rate.
How can I prevent the training loss from increasing?
This huge jump does not happen with the SGD optimiser.
from keras.models import Model
from keras.layers import Input, Dense

inputs = Input(shape=(X_train.shape[1],))
Dx = Dense(32, activation="relu")(inputs)
Dx = Dense(32, activation="relu")(Dx)
# stack 20 more hidden layers of the same width
for i in range(20):
    Dx = Dense(32, activation="relu")(Dx)
Dx = Dense(1, activation="sigmoid")(Dx)
D = Model(inputs=[inputs], outputs=[Dx])
D.compile(loss="binary_crossentropy", optimizer="adam")
D.fit(X_train, y_train, epochs=20)
Upvotes: 4
Views: 5425
Reputation: 1642
Your network is quite deep for a fully connected architecture. Most likely you have been hit by vanishing or exploding gradients, i.e. numerical problems caused by repeatedly multiplying very small or very large numbers. I'd recommend a shallower but wider network; with dense layers, 2-3 layers is often enough in my experience. If you prefer to keep the deeper architecture, you could try adding skip connections.
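For illustration, here is a minimal sketch of both options in the same Keras functional style as your code. The layer widths, block count, and helper names (build_wide_shallow, build_with_skips) are my own choices for the example, not taken from your setup:

from keras.models import Model
from keras.layers import Input, Dense, add

def build_wide_shallow(input_dim):
    # 2-3 wider dense layers instead of ~20 narrow ones
    inputs = Input(shape=(input_dim,))
    x = Dense(128, activation="relu")(inputs)
    x = Dense(128, activation="relu")(x)
    outputs = Dense(1, activation="sigmoid")(x)
    return Model(inputs=inputs, outputs=outputs)

def build_with_skips(input_dim, n_blocks=10):
    # keep the depth, but add identity shortcuts around pairs of layers
    # so gradients have a short path back to the input
    inputs = Input(shape=(input_dim,))
    x = Dense(32, activation="relu")(inputs)
    for _ in range(n_blocks):
        shortcut = x
        x = Dense(32, activation="relu")(x)
        x = Dense(32, activation="relu")(x)
        x = add([shortcut, x])  # skip connection
    outputs = Dense(1, activation="sigmoid")(x)
    return Model(inputs=inputs, outputs=outputs)

D = build_wide_shallow(X_train.shape[1])
D.compile(loss="binary_crossentropy", optimizer="adam")
D.fit(X_train, y_train, epochs=20)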
Upvotes: 2
Reputation: 15
This might come from a small batch size. You could try increasing the batch size, referring to this.
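For example, the batch size can be passed directly to fit; the value of 256 below is just an illustrative choice (Keras defaults to 32):

D.fit(X_train, y_train, epochs=20, batch_size=256)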
Upvotes: 1