Georg Heiler

Reputation: 17724

sklearn custom scorer multiple metrics at once

I have a function which returns an Observation object containing multiple metrics. How can I integrate it into a custom sklearn scorer? The class is defined as:

class Observation():
    def __init__(self):
        self.statValues = {}
        self.modelName = ""

    def setModelName(self, nameOfModel):
        self.modelName = nameOfModel

    def addStatMetric(self, metricName,metricValue):
        self.statValues[metricName] = metricValue

A custom scorer is then defined like:

from sklearn.metrics import make_scorer

def myAllScore(y_true, y_predicted):
    # should return an Observation instance (not the class itself)
    return Observation()

my_scorer = make_scorer(myAllScore)

The statValues of the returned Observation could look like:

{   'AUC_R': 0.6892943119440752,
    'Accuracy': 0.9815382629183745,
    'Error rate': 0.018461737081625407,
    'False negative rate': 0.6211453744493393,
    'False positive rate': 0.0002660016625103907,
    'Lift value': 33.346741089307166,
    'Precision J': 0.9772727272727273,
    'Precision N': 0.9815872808592603,
    'Rate of negative predictions': 0.0293063938288739,
    'Rate of positive predictions': 0.011361068973307943,
    'Sensitivity (true positives rate)': 0.3788546255506608,
    'Specificity (true negatives rate)': 0.9997339983374897,
    'f1_R': 0.9905775376404309,
    'kappa': 0.5384745595159575}

Upvotes: 2

Views: 3171

Answers (2)

Dror

Reputation: 13081

As a matter of fact it is possible, as described in this fork: multiscorer.

For the sake of completeness, here's an example:

from multiscorer.multiscorer import MultiScorer

# Scikit-learn metrics and utilities for demonstration
from sklearn.metrics import accuracy_score, precision_score
from sklearn.model_selection import cross_val_score
from numpy import average

scorer = MultiScorer({
    'accuracy': (accuracy_score, {}),
    'precision': (precision_score, {'average': 'macro'})
})

...

cross_val_score(clf, X, target, scoring=scorer)

results = scorer.get_results()

for metric in results.keys():
    print("%s: %.3f" % (metric, average(results[metric])))

Upvotes: 4

lejlot

Reputation: 66825

In short: you cannot.

Long version: a scorer has to return a single scalar, since it is used for model selection and, in general, for comparing objects. Because there is no complete ordering over vector spaces, you cannot return a vector (or a dictionary, which from a mathematical perspective can be seen as a vector) from a scorer. Furthermore, other use cases such as cross validation also do not support arbitrary structured objects as return values, since they call np.mean over the list of scores, an operation that is not defined for a list of Python dictionaries (which is what your method returns).

The only thing you can do is create a separate scorer for each of your metrics and use them independently.
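A minimal sketch of this suggestion: one scorer per metric, each passed to a separate cross_val_score call. The estimator and dataset here (LogisticRegression on make_classification data) are placeholders for illustration, not part of the original question.

    # One sklearn scorer per metric, evaluated independently
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import make_scorer, accuracy_score, f1_score
    from sklearn.model_selection import cross_val_score

    X, y = make_classification(n_samples=200, random_state=0)
    clf = LogisticRegression()

    # each metric gets its own scalar-valued scorer
    scorers = {
        'accuracy': make_scorer(accuracy_score),
        'f1': make_scorer(f1_score),
    }

    for name, scorer in scorers.items():
        scores = cross_val_score(clf, X, y, scoring=scorer, cv=5)
        print("%s: %.3f" % (name, scores.mean()))

Each call returns a plain array of scalars, so np.mean works as cross validation expects.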

Upvotes: 5
