Michelle

Reputation: 263

Keras model.evaluate()

I have implemented a neural network using Keras and now I would like to try different combinations of input features and conduct hyperparameter tuning. So far I am using MSE as a loss and MAE as a metric. My code looks like this:

#Create the model
model = Sequential()

#Add first hidden layer
model.add(Dense(units=10, input_dim=n_features, activation='relu')) 

#Add output layer
model.add(Dense(units=1, activation='sigmoid'))

model.summary()

model.compile(optimizer='adam', loss='mse', metrics=['mae'])

history = model.fit(x_train, y_train, batch_size=32, epochs=100, validation_data=(x_val, y_val))

result = pd.DataFrame(history.history)
result.head(5)

which gives me the training loss and MAE, as well as the validation loss and MAE, after every epoch.

Now, when comparing different networks, I would like a single loss value per network to compare. I am not sure whether I can just use model.evaluate() for this, or what exactly it does.

# Store and print validation loss and validation MAE
val_loss, val_mae = model.evaluate(x_val, y_val, verbose=0)
print('validation loss (MSE):', val_loss, '\nvalidation MAE:', val_mae)

The output that I get from model.evaluate() is not the same as if I would take the minimum loss or MAE for all 100 epochs. So what does model.evaluate() do?

Upvotes: 0

Views: 7710

Answers (2)

Carl Kristensen

Reputation: 481

When you use sigmoid as the last activation function, MSE is not the correct loss function.

I would suggest 'binary_crossentropy', or alternatively drop the activation function on your last layer and keep 'mse' as your loss function.

- Multi-label classification
  activation: sigmoid
  loss: binary_crossentropy

- Multi-class classification
  activation: softmax
  loss: categorical_crossentropy

- Regression (-inf, +inf)
  activation: none
  loss: mse

In both multi-class and multi-label classification you are checking how often your prediction was correct, while with MSE you are measuring the distance from the actual correct value.

So they are different ways of measuring error.

Upvotes: 0

Ayush Goel

Reputation: 375

When you train the model, Keras records the loss after every epoch (one pass over the dataset). It is quite possible that during training your model finds a good minimum (say at epoch 50) but then jumps to another, slightly worse one later (at epoch 99) and stops training there.

Taking the minimum loss over all 100 epochs would give you the loss the NN would have if its parameters (not hyperparameters) were still what they were at epoch 50.

model.evaluate() just takes your neural network as it is (at epoch 100), computes predictions, and then calculates the loss.

Thus, the minimum loss is likely to be lower (though only slightly, for good hyperparameters) than what model.evaluate() reports, but model.evaluate() tells you where your NN currently stands.
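To make this concrete, here is a small self-contained sketch (with hypothetical random data, just so it runs) showing that model.evaluate() simply recomputes the loss from the final-epoch weights, which is why it need not match the minimum of history.history['val_loss']:

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Input

# Hypothetical toy regression data
rng = np.random.default_rng(0)
x_train, y_train = rng.normal(size=(160, 4)), rng.normal(size=(160, 1))
x_val, y_val = rng.normal(size=(40, 4)), rng.normal(size=(40, 1))

model = Sequential([Input(shape=(4,)),
                    Dense(10, activation='relu'),
                    Dense(1)])
model.compile(optimizer='adam', loss='mse', metrics=['mae'])
history = model.fit(x_train, y_train, epochs=5, batch_size=32,
                    validation_data=(x_val, y_val), verbose=0)

# evaluate() runs a forward pass with the *current* (final-epoch) weights
val_loss, val_mae = model.evaluate(x_val, y_val, verbose=0)

# The same loss, computed by hand from final-epoch predictions:
manual_mse = float(np.mean((model.predict(x_val, verbose=0) - y_val) ** 2))
# val_loss and manual_mse agree up to floating point, while
# min(history.history['val_loss']) may be lower if an earlier epoch was better
```

If you want evaluate() to reflect the best epoch rather than the last one, Keras callbacks such as EarlyStopping(restore_best_weights=True) or ModelCheckpoint(save_best_only=True) can roll the model back to the weights with the lowest validation loss before you call it.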

Upvotes: 2
