Reputation: 41
I have data (X, y) on which GridSearchCV performs the search and training. Scoring is done with a custom criterion T_scorer. Is it possible to get access to the trained model inside the T_scorer function? I need T_scorer to predict on data X1. That is, at each iteration the model is trained on (X, y) and evaluated on (X1, y1). The data (X1, y1) do not participate in training at all, and GridSearchCV never sees them.
Ideally, training should take place on (X, y), while the scoring should be based on predictions made on (X1, y1).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix, make_scorer
from sklearn.model_selection import GridSearchCV, StratifiedKFold

def T_scorer(y_true, y_pred, clf, **kwargs):
    r = np.sum((y_pred == 0) & (y_pred == y_true))
    y_pred1 = clf.predict(X1)  # It doesn't work
    confmat = confusion_matrix(y, y_pred)
    print(confmat)
    print(r)
    return r

_scorer = make_scorer(T_scorer)

clf = RandomForestClassifier()
grid_searcher = GridSearchCV(clf, parameter_grid,
                             cv=StratifiedKFold(shuffle=True, random_state=42),
                             verbose=20, scoring=_scorer)
grid_searcher.fit(X, y)

clf_best = grid_searcher.best_estimator_
print('Best params = ', clf_best.get_params())
Upvotes: 1
Views: 150
Reputation: 36599
make_scorer() should be used only when you have a function with the signature (y_true, y_pred). When you wrap your function with make_scorer(), the returned scorer has the signature func(estimator, X, y), which is what GridSearchCV actually calls. So instead of using make_scorer, you can define your function with that signature and pass it directly:
# I am assuming this is the data you want to use
X1 = X[:1000]
y1 = y[:1000]

def T_scorer(clf, X, y):
    # The X and y passed in by GridSearchCV are not used here
    y_pred1 = clf.predict(X1)
    r = np.sum((y_pred1 == 0) & (y1 == y_pred1))
    confmat = confusion_matrix(y1, y_pred1)
    print(confmat)
    return r

# Now don't use make_scorer, pass the function directly
grid_searcher = GridSearchCV(clf, ..., verbose=20, scoring=T_scorer)
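
For completeness, here is a minimal self-contained sketch of the whole pattern. The synthetic data from make_classification, the 1000-row hold-out split, and the parameter grid values are illustrative assumptions, not part of the original question; substitute your own (X, y), (X1, y1) and grid.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import GridSearchCV, StratifiedKFold

# Synthetic data just for illustration; replace with your own sets
X_all, y_all = make_classification(n_samples=5000, n_classes=2, random_state=42)
X, y = X_all[1000:], y_all[1000:]    # data GridSearchCV trains on
X1, y1 = X_all[:1000], y_all[:1000]  # hold-out data the scorer evaluates on

def T_scorer(clf, X_cv, y_cv):
    # X_cv / y_cv are the CV fold passed in by GridSearchCV; they are ignored,
    # and the fitted estimator is evaluated on the external hold-out (X1, y1)
    y_pred1 = clf.predict(X1)
    r = np.sum((y_pred1 == 0) & (y1 == y_pred1))
    print(confusion_matrix(y1, y_pred1))
    return r

parameter_grid = {'n_estimators': [50, 100], 'max_depth': [3, None]}  # illustrative values
clf = RandomForestClassifier(random_state=42)
grid_searcher = GridSearchCV(clf, parameter_grid,
                             cv=StratifiedKFold(shuffle=True, random_state=42),
                             scoring=T_scorer)  # callable with (estimator, X, y) signature
grid_searcher.fit(X, y)
print('Best params = ', grid_searcher.best_estimator_.get_params())

Because the scorer always evaluates on the same (X1, y1), every hyper-parameter combination is compared on that fixed hold-out set rather than on the CV test folds, which is the behaviour the question asks for.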
Upvotes: 2