Reputation: 1
I'm learning machine learning and I'm facing a mismatch I can't explain.
I'm running a grid search to find the best model, according to the accuracy returned by GridSearchCV.
import sklearn.metrics
import sklearn.model_selection
import sklearn.neighbors

model = sklearn.neighbors.KNeighborsClassifier()

# Hyperparameter grid for the KNN classifier
n_neighbors = [3, 4, 5, 6, 7, 8, 9]
weights = ['uniform', 'distance']
algorithm = ['auto', 'ball_tree', 'kd_tree', 'brute']
leaf_size = [20, 30, 40, 50]
p = [1]
param_grid = dict(n_neighbors=n_neighbors, weights=weights, algorithm=algorithm, leaf_size=leaf_size, p=p)

grid = sklearn.model_selection.GridSearchCV(estimator=model, param_grid=param_grid, cv=5, n_jobs=1)
SGDgrid = grid.fit(data1, targetd_simp['VALUES'])

print("SGD Classifier: ")
print("Best: ")
print(SGDgrid.best_score_)
value = SGDgrid.best_score_
print("params:")
print(SGDgrid.best_params_)
print("Best estimator:")
print(SGDgrid.best_estimator_)

# Evaluate the best estimator on the same data used for the search
y_pred_train = SGDgrid.best_estimator_.predict(data1)
print(sklearn.metrics.confusion_matrix(targetd_simp['VALUES'], y_pred_train))
print(sklearn.metrics.accuracy_score(targetd_simp['VALUES'], y_pred_train))
The results I get are the following:
SGD Classifier:
Best:
0.38694539229180525
params:
{'algorithm': 'auto', 'leaf_size': 20, 'n_neighbors': 8, 'p': 1, 'weights': 'distance'}
Best estimator:
KNeighborsClassifier(leaf_size=20, n_neighbors=8, p=1, weights='distance')
[[4962 0 0]
[ 0 4802 0]
[ 0 0 4853]]
1.0
This model is probably highly overfitted. I still need to check that, but it's not the point of the question here.
So, basically, if I understand correctly, GridSearchCV is finding a best accuracy score of 0.3869 (quite poor) for one of the chunks in the cross-validation, but the final confusion matrix is perfect, and so is the accuracy computed from it. It doesn't make much sense to me... How can a model that is, in theory, this bad perform so well?
I also added scoring='accuracy' in GridSearchCV to make sure that the returned value is actually the accuracy, and it returns exactly the same value.
What am I missing here?
Upvotes: 0
Views: 846
Reputation: 5164
The behavior you are describing is quite normal and to be expected. You should know that GridSearchCV has a parameter refit, which is set to True by default. It triggers the following:
Refit an estimator using the best found parameters on the whole dataset.
This means that the estimator returned by best_estimator_ has been refit on your whole dataset (data1 in your case). The estimator is therefore predicting on data it has already seen during training and, as expected, performs especially well on it. You can easily reproduce this with the following example:
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(random_state=7)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

search = GridSearchCV(KNeighborsClassifier(), param_grid={'n_neighbors': [3, 4, 5]})
search.fit(X_train, y_train)
print(search.best_score_)
>>> 0.8533333333333333
print(accuracy_score(y_train, search.predict(X_train)))
>>> 0.9066666666666666
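As a side note, best_score_ is not the score of a single fold: it is the mean validation accuracy over all folds for the best parameter combination. You can verify this from cv_results_ (a small sketch, reusing the search object from above):
import numpy as np

# Per-fold validation scores of the best candidate (default cv=5)
best = search.best_index_
fold_scores = [search.cv_results_[f'split{i}_test_score'][best] for i in range(5)]
print(fold_scores)
print(np.mean(fold_scores))  # identical to search.best_score_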
While this is not as impressive as in your case, it is still a clear result. During cross-validation, the model is validated against a fold that was not used for training it, and thus against data it has not seen before. In the second case, however, the model has already seen all the data during training, and it is to be expected that it performs better on it.
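To make this concrete, you can score a fresh, unfitted copy of the best estimator with cross-validation and compare it against its score on its own training data; a minimal sketch, again reusing the example above:
from sklearn.base import clone
from sklearn.model_selection import cross_val_score

# Each fold is predicted by a model trained on the remaining folds only
knn = clone(search.best_estimator_)
print(cross_val_score(knn, X_train, y_train, cv=5).mean())  # close to search.best_score_

# The same model scored on its own training data is systematically more optimistic
knn.fit(X_train, y_train)
print(knn.score(X_train, y_train))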
To get a better sense of the true model performance, you should evaluate on a holdout set with data the model has not seen before:
print(accuracy_score(y_test, search.predict(X_test)))
>>> 0.76
As you can see, the model performs considerably worse on this data, which shows that the former metrics were all a bit too optimistic. The model did in fact not generalize that well.
In conclusion, your result is not surprising and has a simple explanation. The large discrepancy between the scores is striking, but it follows the same logic and is actually just a clear indicator of overfitting.
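Finally, if you do not want GridSearchCV to refit on the whole training set, the refit step can be disabled; best_params_ and best_score_ then remain available, but best_estimator_ does not. A minimal sketch:
# With refit=False, no estimator is fit on the full data,
# so you have to train the final model yourself
search_no_refit = GridSearchCV(KNeighborsClassifier(), param_grid={'n_neighbors': [3, 4, 5]}, refit=False)
search_no_refit.fit(X_train, y_train)

final_model = KNeighborsClassifier(**search_no_refit.best_params_)
final_model.fit(X_train, y_train)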
Upvotes: 2