Reputation: 11
I am following an example from a data science textbook and have run into an issue where I am getting NaN values for the loss when running simple Keras neural networks to find the optimal learning rate.
# Get data and split into test/train/valid and normalize
(X_train_full, y_train_full), (X_test, y_test) = keras.datasets.mnist.load_data()
X_valid, X_train = X_train_full[:5000] / 255., X_train_full[5000:] / 255.
y_valid, y_train = y_train_full[:5000], y_train_full[5000:]
X_test = X_test / 255.
# Callback to grow the learning rate at each iteration.
# Also record learning rate and loss at each iteration.
K = keras.backend
class ExponentialLearningRate(keras.callbacks.Callback):
def __init__(self, factor):
self.factor = factor
self.rates = []
self.losses = []
def on_batch_end(self, batch, logs):
self.rates.append(K.get_value(self.model.optimizer.lr))
self.losses.append(logs["loss"])
K.set_value(self.model.optimizer.lr, self.model.optimizer.lr * self.factor)
# Define the model and compile/fit.
keras.backend.clear_session()
np.random.seed(42)
tf.random.set_seed(42)
model = keras.models.Sequential([
keras.layers.Flatten(input_shape=[28, 28]),
keras.layers.Dense(300, activation="relu"),
keras.layers.Dense(100, activation="relu"),
keras.layers.Dense(10, activation="softmax")
])
model.compile(loss="sparse_categorical_crossentropy",
optimizer=keras.optimizers.SGD(lr=1e-3),
metrics=["accuracy"])
expon_lr = ExponentialLearningRate(factor=1.005)
history = model.fit(X_train, y_train, epochs=1,
validation_data=(X_valid, y_valid),
callbacks=[expon_lr])
Running this gives an output of:
1719/1719 [==============================] - 6s 4ms/step - loss: nan - accuracy: 0.6030 - val_loss: nan - val_accuracy: 0.0958
Plotting the loss vs learning rate gives (top is my result, bottom is the expected result from the example I am following):
Notably, the example loss is much noisier than mine and ranges from ~2.5 to ~0.25. My loss only ranges from ~2.5 to exactly 1, at which point the loss goes NaN.
Perhaps something with keras/tf has been updated since this example was written, but as I am new to keras I am wondering what might be the issue here.
Upvotes: 0
Views: 511
Reputation: 772
Your problem is the ExponentialLearningRate, your learning rate go from 0.0010150751 to 5.237502 which is why your loss is exploding, change the optimizer like this
optimizer=tf.keras.optimizers.Adam(0.001)
and remove the callback, your loss will be fine then
Upvotes: 1