Reputation: 734
I'm using the following custom metrics for Keras:
from keras import backend as K

def mcor(y_true, y_pred):
    # Matthews correlation coefficient
    y_pred_pos = K.round(K.clip(y_pred, 0, 1))
    y_pred_neg = 1 - y_pred_pos
    y_pos = K.round(K.clip(y_true, 0, 1))
    y_neg = 1 - y_pos
    tp = K.sum(y_pos * y_pred_pos)
    tn = K.sum(y_neg * y_pred_neg)
    fp = K.sum(y_neg * y_pred_pos)
    fn = K.sum(y_pos * y_pred_neg)
    numerator = (tp * tn - fp * fn)
    denominator = K.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / (denominator + K.epsilon())

def precision(y_true, y_pred):
    """Precision metric.

    Only computes a batch-wise average of precision.
    Computes the precision, a metric for multi-label classification of
    how many selected items are relevant.
    """
    true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
    predicted_positives = K.sum(K.round(K.clip(y_pred, 0, 1)))
    return true_positives / (predicted_positives + K.epsilon())

def recall(y_true, y_pred):
    """Recall metric.

    Only computes a batch-wise average of recall.
    Computes the recall, a metric for multi-label classification of
    how many relevant items are selected.
    """
    true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
    possible_positives = K.sum(K.round(K.clip(y_true, 0, 1)))
    return true_positives / (possible_positives + K.epsilon())

def f1(y_true, y_pred):
    # Harmonic mean of the batch-wise precision and recall defined above.
    p = precision(y_true, y_pred)
    r = recall(y_true, y_pred)
    return 2 * ((p * r) / (p + r + K.epsilon()))
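For reference, mcor computes the Matthews correlation coefficient from batch-level confusion-matrix counts; in standard notation:

$$\mathrm{MCC} = \frac{TP \cdot TN - FP \cdot FN}{\sqrt{(TP+FP)\,(TP+FN)\,(TN+FP)\,(TN+FN)}}$$

with K.epsilon() added to the denominator only to guard against division by zero.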
This is the compilation statement:
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy', precision, recall, f1])
The model is saved automatically with ModelCheckpoint whenever a new best model is found. The classification categories have been one-hot encoded.
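For reference, the checkpoint setup looks roughly like the sketch below; the monitored quantity, epoch count, and data variables are placeholders rather than my exact configuration:

from keras.callbacks import ModelCheckpoint

# Sketch of the checkpoint setup; monitor, epochs, and the data
# variables (X_train, y_train, X_val, y_val) are placeholders.
checkpoint = ModelCheckpoint('Asset_3_best_model.h5',  # file loaded back below
                             monitor='val_loss',       # assumed monitor
                             save_best_only=True,
                             verbose=1)
model.fit(X_train, y_train,
          validation_data=(X_val, y_val),
          epochs=50,
          callbacks=[checkpoint])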
However, when the saved model is loaded back using:
# load model
from keras.models import load_model

custom_obj = {'accuracy': accuracy, 'Loss': Loss,
              'precision': precision, 'recall': recall, 'f1': f1}
model = load_model('Asset_3_best_model.h5', custom_objects=custom_obj)
The custom objects listed here correspond to the custom Keras functions defined above.
When the model is loaded back from disk, I observe the following error:
ValueError: ('Could not interpret metric function identifier:', 0.8701059222221375)
I've tried many different custom functions, but I couldn't find a way to re-load my saved model. This is a multi-class time-series classification problem, and I'd also like to know whether there is an easier way to handle these metric calculations.
Upvotes: 1
Views: 2698
Reputation: 345
I was also looking for a way to calculate the F1 score for my binary classification problem. I came across this TensorFlow tutorial, and it worked for me: https://www.tensorflow.org/tutorials/structured_data/imbalanced_data
It uses the built-in Keras metrics rather than custom implementations.
METRICS = [
    keras.metrics.TruePositives(name='tp'),
    keras.metrics.FalsePositives(name='fp'),
    keras.metrics.TrueNegatives(name='tn'),
    keras.metrics.FalseNegatives(name='fn'),
    keras.metrics.BinaryAccuracy(name='accuracy'),
    keras.metrics.Precision(name='precision'),
    keras.metrics.Recall(name='recall'),
    keras.metrics.AUC(name='auc'),
]
After this, pass the list to the metrics argument of compile:
model.compile(..., metrics=METRICS)
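A side benefit relevant to your question: since these are built-in Keras metric classes, the saved model should load back without a custom_objects dictionary. A minimal sketch, reusing the file name from your question:

from keras.models import load_model

# Built-in metrics are registered with Keras, so no custom_objects
# mapping should be needed (file name taken from the question above).
model = load_model('Asset_3_best_model.h5')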
I commented out tp, fp, tn, and fn in my code and got the output below:
Train on 2207 samples, validate on 552 samples
Epoch 1/6
- 7s - loss: 1.2502 - accuracy: 0.6357 - precision: 0.4252 - recall: 0.0688 - auc: 0.5138 - val_loss: 0.6229 - val_accuracy: 0.6667 - val_precision: 0.8000 - val_recall: 0.0214 - val_auc: 0.6800
Epoch 2/6
- 7s - loss: 0.6451 - accuracy: 0.6461 - precision: 0.7500 - recall: 0.0076 - auc: 0.5735 - val_loss: 0.6368 - val_accuracy: 0.6685 - val_precision: 0.8333 - val_recall: 0.0267 - val_auc: 0.7144
...
Check if this fixes your problem. If I have missed anything, please let me know.
Upvotes: 2