Custom scikit-learn scorer can't access mean after fit

Question

I am trying to create a custom estimator based on scikit learn. I have written the below dummy code to explain my problem. In the score method, I am trying to access mean_ calulated in fit. But I am unable to. What I am doing wrong? I have tried many things and have done this referring three four articles. But didn't find the issue.

I have read the documentation and did few changes. But nothing worked. I have also tried inheriting BaseEstimator, ClassifierMixin. But that also didn't work.

This a dummy program. Don't go by what it is trying to do.

import numpy as np
from sklearn.model_selection import cross_val_score


class FilterElems:
    def __init__(self, thres):
        self.thres = thres

    def fit(self, X, y=None, **kwargs):
        self.mean_ = np.mean(X)
        self.std_ = np.std(X)
        return self

    def predict(self, X):
        #         return sign(self.predict(inputs))
        X = (X - self.mean_) / self.std_
        return X[X > self.thres]

    def get_params(self, deep=False):
        return {'thres': self.thres}

    def score(self, *x):
        print(self.mean_)  # errors out, mean_ and std_ are wiped out
        if len(x[1]) > 50:
            return 1.0
        else:
            return 0.5


model = FilterElems(thres=0.5)
print(cross_val_score(model,
                      np.random.randint(1, 1000, (100, 100)),
                      None,
                      scoring=model.score,
                      cv=5))

Err:

AttributeError: 'FilterElems' object has no attribute 'mean_'

mujjiga · Accepted Answer

You are almost there.

The signature for scorer is scorer(estimator, X, y). The cross_val_score calls the scorer method by passing the estimator object as the first parameter. Since your signature of scorer is a variable argument function, the first item will hold the estimator

change your score to

def score(self, *x):
    print(x[0].mean_)
    if len(x[1]) > 50:
        return 1.0
    else:
        return 0.5

Working code

import numpy as np
from sklearn.model_selection import cross_val_score

class FilterElems:
    def __init__(self, thres):
        self.thres = thres

    def fit(self, X, y=None, **kwargs):
        self.mean_ = np.mean(X)
        self.std_ = np.std(X)
        return self

    def predict(self, X):
        X = (X - self.mean_) / self.std_
        return X[X > self.thres]

    def get_params(self, deep=False):
        return {'thres': self.thres}

    def score(self, estimator, *x):
        print(estimator.mean_, estimator.std_) 
        if len(x[0]) > 50:
            return 1.0
        else:
            return 0.5

model = FilterElems(thres=0.5)
print(cross_val_score(model,
                      np.random.randint(1, 1000, (100, 100)),
                      None,
                      scoring=model.score,
                      cv=5))

Outout

504.750125 288.84916035447355
501.7295 289.47825925231416
503.743375 288.8964170227962
503.0325 287.8292687406025
500.041 289.3488678377712
[0.5 0.5 0.5 0.5 0.5]

Custom scikit-learn scorer can't access mean after fit

Answers (2)

Related Questions

Custom scikit-learn scorer can&#39;t access mean after fit

Answers (2)

Related Questions

Custom scikit-learn scorer can't access mean after fit