Pass a scoring function from sklearn.metrics to GridSearchCV

Question

GridSearchCV's documentation states that I can pass a scoring function:

scoring : string, callable or None, default=None

I would like to use accuracy_score from sklearn.metrics as the scoring function.

So here is my attempt. Imports and some data:

import numpy as np
from sklearn.cross_validation import KFold, cross_val_score
from sklearn.grid_search import GridSearchCV
from sklearn.metrics import accuracy_score
from sklearn import neighbors

X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])
Y = np.array([0, 1, 0, 0, 0, 1])

Now when I use just k-fold cross-validation without my scoring function, everything works as intended:

parameters = {
    'n_neighbors': [2, 3, 4],
    'weights':['uniform', 'distance'],
    'p': [1, 2, 3]
}
model = neighbors.KNeighborsClassifier()
k_fold = KFold(len(Y), n_folds=6, shuffle=True, random_state=0)
clf = GridSearchCV(model, parameters, cv=k_fold)  # TODO will change
clf.fit(X, Y)

print clf.best_score_

But when I change the line to

clf = GridSearchCV(model, parameters, cv=k_fold, scoring=accuracy_score) # or accuracy_score()

I get the error: ValueError: Cannot have number of folds n_folds=10 greater than the number of samples: 6. I don't think this message reflects the real problem.

In my opinion the problem is that accuracy_score does not follow the signature scorer(estimator, X, y) that the documentation requires.

So how can I fix this problem?

maxymoo · Accepted Answer

It will work if you change scoring=accuracy_score to scoring='accuracy' (see the documentation for the full list of scorers you can use by name in this way.)
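As a minimal sketch of this fix, here is the setup from the question with scoring='accuracy'. It assumes a recent scikit-learn release, where GridSearchCV and KFold are imported from sklearn.model_selection and KFold takes n_splits (the question uses the older sklearn.cross_validation and sklearn.grid_search modules).

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.neighbors import KNeighborsClassifier

# Same toy data as in the question.
X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])
Y = np.array([0, 1, 0, 0, 0, 1])

parameters = {
    "n_neighbors": [2, 3, 4],
    "weights": ["uniform", "distance"],
    "p": [1, 2, 3],
}

# Assumption: modern API, where KFold takes n_splits instead of (n, n_folds).
k_fold = KFold(n_splits=6, shuffle=True, random_state=0)

# Pass the scorer by name instead of passing the metric function itself.
clf = GridSearchCV(KNeighborsClassifier(), parameters, cv=k_fold, scoring="accuracy")
clf.fit(X, Y)

print(clf.best_params_)
print(clf.best_score_)
```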

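In principle you can pass a custom scoring callable the way you are trying to, but you're right that accuracy_score doesn't have the right API: it is called as accuracy_score(y_true, y_pred), whereas GridSearchCV calls the scorer as scorer(estimator, X, y). Wrapping the metric with sklearn.metrics.make_scorer gives a callable with the expected signature.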
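If you do want to pass the metric function rather than a name, one option is to adapt it to the scorer(estimator, X, y) signature. The sketch below shows two ways, again assuming a recent scikit-learn with sklearn.model_selection: make_scorer from sklearn.metrics, and a hand-written wrapper (manual_scorer is a hypothetical name used only for illustration). The parameter grid is trimmed for brevity.

```python
import numpy as np
from sklearn.metrics import accuracy_score, make_scorer
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.neighbors import KNeighborsClassifier

X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])
Y = np.array([0, 1, 0, 0, 0, 1])

# accuracy_score(y_true, y_pred) compares two label arrays.
# make_scorer wraps it into a callable with the scorer(estimator, X, y) signature.
wrapped_scorer = make_scorer(accuracy_score)


def manual_scorer(estimator, X, y):
    """Hypothetical hand-written equivalent: predict, then score."""
    return accuracy_score(y, estimator.predict(X))


parameters = {"n_neighbors": [2, 3, 4], "weights": ["uniform", "distance"]}
k_fold = KFold(n_splits=6, shuffle=True, random_state=0)

clf_wrapped = GridSearchCV(
    KNeighborsClassifier(), parameters, cv=k_fold, scoring=wrapped_scorer
).fit(X, Y)
clf_manual = GridSearchCV(
    KNeighborsClassifier(), parameters, cv=k_fold, scoring=manual_scorer
).fit(X, Y)

print(clf_wrapped.best_score_, clf_manual.best_score_)
```

Both searches should report the same best score, since each scorer computes plain accuracy on the held-out fold.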
