nithin

Reputation: 781

What is the difference between knn.score and the accuracy metric in KNN - scikit-learn

I am interested in the accuracy of my predictions versus the test labels, which completely makes sense.

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

iris_dataset = load_iris()
X_train, X_test, y_train, y_test = train_test_split(iris_dataset['data'], iris_dataset['target'], random_state=0)
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)
y_pred = knn.predict(X_test)
accuracy_score(y_test, y_pred)  # ~0.97: accuracy of the predictions vs. the test labels

I tried the same thing with knn.score. Here is the catch: the documentation says it "Returns the mean accuracy on the given test data and labels."

knn.score(X_test, y_test)  # ~0.97 accuracy as well

My question is: why should someone care about this score? X_test and y_test are just the data I split off from the given dataset for supervised learning, so what is the point of having score here? Am I completely missing something? If I check score on this data, shouldn't it give me 100%?

Upvotes: 3

Views: 21783

Answers (1)

Marcus V.

Reputation: 6859

The score function is simply a utility function providing a default metric that some scikit-learn algorithms (mostly those in the model selection module, e.g. GridSearchCV or cross_validate) fall back to when no other metric is specified. For classification this is typically accuracy, and for regression it is the coefficient of determination R².
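For example, here is a minimal sketch of that fallback (reusing the iris data from the question; the variable names are just for illustration): leaving scoring unset makes cross_validate call the estimator's own score method, while passing scoring="accuracy" explicitly requests the same metric by name.

from sklearn.datasets import load_iris
from sklearn.model_selection import cross_validate
from sklearn.neighbors import KNeighborsClassifier

iris = load_iris()
knn = KNeighborsClassifier(n_neighbors=5)

# No scoring given: cross_validate falls back to knn.score (mean accuracy).
default_cv = cross_validate(knn, iris.data, iris.target, cv=5)
# Same metric, requested explicitly by name.
explicit_cv = cross_validate(knn, iris.data, iris.target, cv=5, scoring="accuracy")

print(default_cv["test_score"])   # per-fold accuracies
print(explicit_cv["test_score"])  # identical numbers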

So it is indeed the same, because score does exactly what you do in your code: it takes the passed matrix X (e.g. X_test in your case), calls predict on it, and then calls accuracy_score. Hence it is no surprise that the two numbers agree. In fact, since scikit-learn is open source, you can check that yourself here.
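The score method in question comes from scikit-learn's ClassifierMixin; a simplified sketch of what it does (not the verbatim scikit-learn source) looks roughly like this:

from sklearn.metrics import accuracy_score

class ClassifierMixinSketch:
    def score(self, X, y, sample_weight=None):
        # Predict on X, then report the mean accuracy against the true labels y.
        return accuracy_score(y, self.predict(X), sample_weight=sample_weight)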

Edit:

So how does that concern you? Well, you can similarly use it in algorithms (e.g. if you build ensembles), or simply to save a line of code as in your example above. And if you were to build your own estimator, this is something where you would have to think about what a reasonable default metric is.
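To make that last point concrete, here is a minimal toy sketch (not from the original answer): if you write your own classifier and inherit from ClassifierMixin, you get the accuracy-based score method as that default for free.

import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin

class MajorityClassClassifier(BaseEstimator, ClassifierMixin):
    """Toy classifier that always predicts the most frequent training class."""

    def fit(self, X, y):
        values, counts = np.unique(y, return_counts=True)
        self.majority_ = values[np.argmax(counts)]
        return self

    def predict(self, X):
        return np.full(len(X), self.majority_)

# score() is inherited from ClassifierMixin, so it reports mean accuracy, e.g.:
# MajorityClassClassifier().fit(X_train, y_train).score(X_test, y_test)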

Upvotes: 3
