Anjith

Reputation: 2308

How to use log_loss scorer in gridsearchcv?

Is it possible to use the log_loss metric in GridSearchCV?

I have seen a few posts where people mention neg_log_loss. Is it the same as log_loss? If not, is it possible to use log_loss directly in GridSearchCV?

Upvotes: 4

Views: 5572

Answers (1)

Luca Massaron

Reputation: 1809

As stated in the documentation, the scoring parameter accepts several kinds of input: a string, a callable, a list/tuple, a dict, or None. If you use strings, the valid entries are listed in the scikit-learn documentation on the scoring parameter.
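As a rough illustration of the dict form, the sketch below evaluates two predefined string scorers at once. The LogisticRegression estimator, the parameter grid, and the short keys "acc" and "nll" are assumptions chosen for the example, not part of the original answer.

```python
from sklearn import datasets
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

X, y = datasets.load_iris(return_X_y=True)

# Dict form: keys are arbitrary labels (hypothetical names here),
# values are predefined scorer strings.
multi_search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid={"C": [0.1, 1, 10]},
    cv=5,
    scoring={"acc": "accuracy", "nll": "neg_log_loss"},
    refit="nll",  # with several metrics, refit names the one used to pick the best model
)
multi_search.fit(X, y)

# One column of mean test scores per metric, one entry per candidate
print(multi_search.cv_results_["mean_test_acc"])
print(multi_search.cv_results_["mean_test_nll"])
```

With several metrics, refit has to name one of the keys if you want best_params_ and best_estimator_ to be available.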

In that list, the string for log loss is "neg_log_loss", i.e. the negative log loss, which is simply the log loss multiplied by -1. This is an easy way to turn a minimization problem into a maximization one, which is what GridSearchCV expects, because it takes a score (higher is better), not a loss. Minimizing the log loss is equivalent to maximizing the negative log loss.
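A minimal sketch of the string form, assuming a LogisticRegression estimator and a small grid over C (both picked only for illustration). It shows that best_score_ comes back negative and that negating it recovers the ordinary log loss.

```python
from sklearn import datasets
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

X, y = datasets.load_iris(return_X_y=True)

search = GridSearchCV(
    LogisticRegression(max_iter=1000),  # illustrative estimator with predict_proba
    param_grid={"C": [0.1, 1, 10]},     # illustrative grid
    cv=5,
    scoring="neg_log_loss",
)
search.fit(X, y)

best_neg_log_loss = search.best_score_  # negated loss, so never positive
best_log_loss = -best_neg_log_loss      # flip the sign to read it as a plain log loss
print(best_log_loss, search.best_params_)
```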

If you would rather pass the log loss function to GridSearchCV yourself, create a scorer from scikit-learn's log_loss function with make_scorer. Setting greater_is_better=False makes the scorer negate the loss, so the reported scores are again negative:

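from sklearn import svm, datasets
from sklearn.model_selection import GridSearchCV
from sklearn.metrics import log_loss, make_scorer

iris = datasets.load_iris()
parameters = {'kernel': ('linear', 'rbf'), 'C': [1, 10]}
# probability=True is required so that SVC exposes predict_proba
svc = svm.SVC(gamma="scale", probability=True)
# greater_is_better=False makes the scorer return the negated log loss.
# Note: from scikit-learn 1.4 on, needs_proba is deprecated in favor of
# response_method="predict_proba".
LogLoss = make_scorer(log_loss, greater_is_better=False, needs_proba=True)
clf = GridSearchCV(svc, parameters, cv=5, scoring=LogLoss)
clf.fit(iris.data, iris.target)

# best_score_ is the negated log loss of the best candidate (a value <= 0)
print(clf.best_score_, clf.best_estimator_)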
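To check that a scorer built this way behaves like the predefined "neg_log_loss" string, here is a small sketch under a few assumptions: it uses LogisticRegression and cross_val_score rather than the SVC grid search to keep it quick, and it inspects the make_scorer signature to choose between response_method (newer scikit-learn releases) and needs_proba (older ones).

```python
import inspect

import numpy as np
from sklearn import datasets
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss, make_scorer
from sklearn.model_selection import cross_val_score

X, y = datasets.load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000)  # illustrative estimator

# Newer releases take response_method; older ones take needs_proba.
if "response_method" in inspect.signature(make_scorer).parameters:
    proba_kwargs = {"response_method": "predict_proba"}
else:
    proba_kwargs = {"needs_proba": True}
custom_scorer = make_scorer(log_loss, greater_is_better=False, **proba_kwargs)

custom_scores = cross_val_score(model, X, y, cv=5, scoring=custom_scorer)
builtin_scores = cross_val_score(model, X, y, cv=5, scoring="neg_log_loss")

print(custom_scores)          # per-fold negated log loss
print(-custom_scores.mean())  # mean log loss with the usual sign
print(np.allclose(custom_scores, builtin_scores))
```

Either way, the values GridSearchCV reports are negated losses, so flip the sign when you want the ordinary log loss.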

Upvotes: 9
