Reputation: 176
I'm trying to figure out how to produce a confusion matrix with cross_validate. I'm able to print out the scores with the code I have so far.
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_validate
from sklearn.metrics import (make_scorer, accuracy_score, precision_score,
                             recall_score, f1_score)

# Instantiating model
model = DecisionTreeClassifier()

# Scorers
scoring = {'accuracy': make_scorer(accuracy_score),
           'precision': make_scorer(precision_score),
           'recall': make_scorer(recall_score),
           'f1_score': make_scorer(f1_score)}

# 10-fold cross validation
scores = cross_validate(model, X, y, cv=10, scoring=scoring)
print("Accuracy (Testing): %0.2f (+/- %0.2f)" % (scores['test_accuracy'].mean(), scores['test_accuracy'].std() * 2))
print("Precision (Testing): %0.2f (+/- %0.2f)" % (scores['test_precision'].mean(), scores['test_precision'].std() * 2))
print("Recall (Testing): %0.2f (+/- %0.2f)" % (scores['test_recall'].mean(), scores['test_recall'].std() * 2))
print("F1-Score (Testing): %0.2f (+/- %0.2f)" % (scores['test_f1_score'].mean(), scores['test_f1_score'].std() * 2))
But I'm trying to get that data into a confusion matrix. I'm able to make a confusion matrix by using cross_val_predict -
y_train_pred = cross_val_predict(model, X, y, cv=10)
confusion_matrix(y, y_train_pred)
Which is great, but since it's running its own cross-validation, the results won't match the scores above. I'm just looking for a way to produce both with matching results.
Upvotes: 3
Views: 4358
Reputation: 2042
The short answer is you can't.
The idea of a confusion matrix is to evaluate one dataset with one trained model. The result is a matrix, not a score like accuracy, so you can't calculate a mean or anything similar over the folds. cross_val_score, as the name suggests, works only on scores; a confusion matrix is not a score, it is a kind of summary of what happened during evaluation.
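For illustration (my own minimal example, not part of the original answer), compare the two kinds of output:

# Minimal illustration (my own example): a score is a single number that can be
# averaged across folds; a confusion matrix is a 2x2 array, which cannot.
from sklearn.metrics import accuracy_score, confusion_matrix

y_true = [0, 1, 1, 0, 1]
y_pred = [0, 1, 0, 0, 1]

print(accuracy_score(y_true, y_pred))    # 0.8 -- a scalar score
print(confusion_matrix(y_true, y_pred))  # [[2 0]
                                         #  [1 2]] -- a matrix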
cross_val_predict is quite similar to what you are looking for. This function splits the data into K parts; each part is predicted by a model trained on the remaining parts, and all the held-out predictions are then merged. But be careful with this function; from the docs (emphasis added):
Passing these predictions into an evaluation metric may not be a valid way to measure generalization performance. Results can differ from cross_validate and cross_val_score unless all tests sets have equal size and the metric decomposes over samples.
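To make that splitting-and-merging behaviour concrete, here is a rough manual equivalent of cross_val_predict (a sketch of my own, not the library's implementation; it assumes X and y are NumPy arrays and that the estimator supports predict):

# Rough manual equivalent of cross_val_predict (my own sketch, not sklearn's code):
# each fold's held-out samples are predicted by a model trained on the rest,
# then the per-fold predictions are stitched back into a single array.
import numpy as np
from sklearn.base import clone
from sklearn.model_selection import KFold

def manual_cross_val_predict(model, X, y, cv):
    y_pred = np.empty_like(y)
    for train_idx, test_idx in cv.split(X, y):
        fold_model = clone(model).fit(X[train_idx], y[train_idx])
        y_pred[test_idx] = fold_model.predict(X[test_idx])
    return y_pred

# e.g. manual_cross_val_predict(model, X, y, KFold(10))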
Upvotes: 2
Reputation: 12602
I think the nicest approach would be to define the confusion matrix as a scorer, instead of or in addition to the other ones you've defined. Luckily, this is an example in the User Guide; see the third bullet here:
def confusion_matrix_scorer(clf, X, y):
    y_pred = clf.predict(X)
    cm = confusion_matrix(y, y_pred)
    return {'tn': cm[0, 0], 'fp': cm[0, 1],
            'fn': cm[1, 0], 'tp': cm[1, 1]}

cv_results = cross_validate(svm, X, y, cv=5,
                            scoring=confusion_matrix_scorer)
Then cv_results['test_tp'] (etc.) is an array containing, for each fold, the number of true positives. Now you can aggregate the per-fold confusion matrices however is most appropriate for you.
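For example (my own aggregation sketch, assuming a binary problem and the cv_results from the snippet above), you could simply sum the per-fold counts into one overall confusion matrix:

# Aggregation sketch (my own, assuming the binary confusion_matrix_scorer above):
# summing the per-fold counts gives one overall confusion matrix
# covering all of the out-of-fold predictions.
import numpy as np

total_cm = np.array([
    [cv_results['test_tn'].sum(), cv_results['test_fp'].sum()],
    [cv_results['test_fn'].sum(), cv_results['test_tp'].sum()],
])
print(total_cm)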
Another approach came to mind first, and I'll add it here in case it's useful for understanding how sklearn deals with things. But I definitely think the first approach is better.
You can set return_estimator=True in cross_validate, in which case the returned dictionary has a key estimator whose value is the list of fitted models. You still need to be able to find the corresponding test folds, though. For that, you can define your cv object manually (e.g. cv = StratifiedKFold(10)) and call cross_validate(..., cv=cv); then cv still contains the information needed to regenerate the splits. So you can use the fitted estimators to score the appropriate test folds, generating the confusion matrices. Or you can use cross_val_predict(..., cv=cv), but at that point you repeat the fitting, so you probably should just skip cross_validate and do the loop yourself.
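For completeness, this is roughly how that might look (my own sketch of the approach described above, reusing the model and scoring from the question and assuming X and y are NumPy arrays; StratifiedKFold without shuffling reproduces the same splits each time it is iterated):

# Sketch of the return_estimator approach (my own code, not from the answer):
# a fixed cv object lets the fitted estimators be matched back to their test folds.
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.metrics import confusion_matrix

cv = StratifiedKFold(10)
cv_results = cross_validate(model, X, y, cv=cv, scoring=scoring,
                            return_estimator=True)

fold_cms = []
for est, (train_idx, test_idx) in zip(cv_results['estimator'], cv.split(X, y)):
    # each fitted estimator evaluates its own held-out fold
    fold_cms.append(confusion_matrix(y[test_idx], est.predict(X[test_idx])))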
Upvotes: 6