Reputation: 51
I have a problem that occurs when I start training my model. The log says that val_loss did not improve from inf and loss: nan. At first I thought it was the learning rate, but now I'm not sure what it is, because I've tried several different learning rates and none of them worked for me. I hope someone can help me.
My settings:
optimizer = Adam, learning rate = 0.01 (I've already tried a number of different learning rates, for example 0.0005, 0.001, 0.00146, 0.005, 0.5, 0.6, 0.7, 0.8, but none of them worked for me)
EarlyStopping = enabled (training stops at epoch 3 because of EarlyStopping, since there is no improvement; I've also disabled EarlyStopping each time the model stopped at epoch 3 and let it run for 100 epochs without it)
ReduceLR = disabled
I'm training this model on my GPU (EVGA RTX 3080 FTW3 ULTRA).
from keras.models import Sequential
from keras.layers import Conv2D, Activation, BatchNormalization, MaxPooling2D, Dropout, Flatten, Dense

model = Sequential()
model.add(Conv2D(32,(3,3),padding='same',kernel_initializer='he_normal',input_shape=(img_rows, img_cols,1)))
model.add(Activation('elu'))
model.add(BatchNormalization())
model.add(Conv2D(32,(3,3),padding='same',kernel_initializer='he_normal',input_shape=(img_rows,img_cols,1)))
model.add(Activation('elu'))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Dropout(0.2))
model.add(Conv2D(64,(3,3),padding='same',kernel_initializer='he_normal'))
model.add(Activation('elu'))
model.add(BatchNormalization())
model.add(Conv2D(64,(3,3),padding='same',kernel_initializer='he_normal'))
model.add(Activation('elu'))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Dropout(0.2))
model.add(Conv2D(128,(3,3),padding='same',kernel_initializer='he_normal'))
model.add(Activation('elu'))
model.add(BatchNormalization())
model.add(Conv2D(128,(3,3),padding='same',kernel_initializer='he_normal'))
model.add(Activation('elu'))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Dropout(0.2))
model.add(Conv2D(256,(3,3),padding='same',kernel_initializer='he_normal'))
model.add(Activation('elu'))
model.add(BatchNormalization())
model.add(Conv2D(256,(3,3),padding='same',kernel_initializer='he_normal'))
model.add(Activation('elu'))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Dropout(0.2))
model.add(Flatten())
model.add(Dense(64,kernel_initializer='he_normal'))
model.add(BatchNormalization())
model.add(Dropout(0.5))
model.add(Dense(64,kernel_initializer='he_normal'))
model.add(Activation('elu'))
model.add(BatchNormalization())
model.add(Dropout(0.5))
model.add(Dense(num_classes,kernel_initializer='he_normal'))
model.add(Activation('softmax'))
print(model.summary())
from keras.optimizers import RMSprop,SGD,Adam
from keras.callbacks import ModelCheckpoint,EarlyStopping,ReduceLROnPlateau
checkpoint = ModelCheckpoint('Wave.h5',
                             monitor='val_loss',
                             mode='min',
                             save_best_only=True,
                             verbose=1)
earlystop = EarlyStopping(monitor='val_loss',
                          min_delta=0,
                          patience=3,
                          verbose=1,
                          restore_best_weights=True)
'''reduce_lr = ReduceLROnPlateau(monitor='val_loss',
                                 factor=0.2,
                                 patience=3,
                                 verbose=1,
                                 min_delta=0.0001)'''
callbacks = [earlystop, checkpoint]  # reduce_lr
model.compile(loss='categorical_crossentropy',
              optimizer=Adam(lr=0.01),
              metrics=['accuracy'])
Upvotes: 2
Views: 5145
Reputation: 393
I found the issue in my case, where I had the same problem: there were nan/inf values in my train/validation set. For some specific datasets these values were created by the df.pct_change() function.
If you are normalizing your values, make sure that the bounds used to normalize them are not nan/inf.
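To illustrate the point, here is a minimal sketch (not from the original answer; the DataFrame and column name are made up) showing how df.pct_change() can introduce nan/inf values and how to check the data before feeding it to the model:

import numpy as np
import pandas as pd

# Toy example: pct_change() produces NaN for the first row and inf where the
# previous value is 0. Either will propagate into the loss and make it nan.
df = pd.DataFrame({"price": [0.0, 1.0, 2.0, 2.0]})
returns = df["price"].pct_change()
print(returns.tolist())            # [nan, inf, 1.0, 0.0]

# Clean up before normalizing / training
clean = returns.replace([np.inf, -np.inf], np.nan).dropna()
print(clean.tolist())              # [1.0, 0.0]

# Generic sanity check for any training array (here just the toy data)
X_train = clean.to_numpy()
assert np.isfinite(X_train).all(), "training data still contains nan/inf"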
Upvotes: 2
Reputation: 785
A few comments...
In this kind of situation the most practical approach is trial and error. It looks like your parameters diverged during training, and there are many possible causes. Also, it seems you are already regularizing your network (dropout, BatchNorm, etc.). A sketch of two common knobs to try is shown below.
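As a hedged sketch (not part of the original answer), two common ways to fight divergence with the same Adam optimizer the question uses are a smaller learning rate and gradient clipping, plus Keras' TerminateOnNaN callback so a run that still blows up fails fast:

from keras.optimizers import Adam
from keras.callbacks import TerminateOnNaN

# A smaller step size plus gradient clipping makes it harder for the weights to blow up.
optimizer = Adam(lr=1e-4, clipnorm=1.0)

model.compile(loss='categorical_crossentropy',
              optimizer=optimizer,
              metrics=['accuracy'])

# TerminateOnNaN aborts training as soon as the loss becomes nan,
# so a diverging configuration stops immediately instead of wasting epochs.
callbacks = [earlystop, checkpoint, TerminateOnNaN()]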
Upvotes: 1