Reputation: 51
Each time I run this code, the accuracy comes out different. Can anyone explain why? Am I missing something here? Thanks in advance :)
Below is my code:
import scipy
import numpy
from sklearn import datasets
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
# Load the iris data set: X holds the features, y the class labels
iris = datasets.load_iris()
X = iris.data
y = iris.target
# Hold out half of the samples for testing (no seed, so the split is random)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.5)
# Use a k-nearest-neighbours classifier
my_classifier = KNeighborsClassifier()
my_classifier.fit(X_train, y_train)
predictions = my_classifier.predict(X_test)
print(predictions)
print(accuracy_score(y_test, predictions))
Upvotes: 2
Views: 64
Reputation: 8187
train_test_split randomly splits the data into training and test sets, so you will get a different split each time you run the script. If you want, there's a random_state parameter that you can set to some number, and it will ensure that you get the same split each time you run the script:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.5, random_state=0)
This should give you an accuracy of 0.96 every time.
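To check this yourself, here is a minimal sketch that runs the same pipeline twice with the same seed and compares the results. The helper name run_once and the extra index array passed to train_test_split are my own additions for illustration, not part of the original code; the rest assumes a reasonably recent scikit-learn where train_test_split lives in sklearn.model_selection.

```python
import numpy as np
from sklearn import datasets
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier


def run_once(random_state=None):
    """Split iris 50/50, fit a default KNN, return (test indices, accuracy).

    run_once is a hypothetical helper for this illustration only.
    """
    iris = datasets.load_iris()
    X, y = iris.data, iris.target
    # Pass the row indices along so we can see which samples land in the test set.
    indices = np.arange(len(y))
    X_train, X_test, y_train, y_test, _, idx_test = train_test_split(
        X, y, indices, test_size=0.5, random_state=random_state
    )
    clf = KNeighborsClassifier()
    clf.fit(X_train, y_train)
    return idx_test, accuracy_score(y_test, clf.predict(X_test))


if __name__ == "__main__":
    # Same seed twice: identical test rows, hence identical accuracy.
    idx_a, acc_a = run_once(random_state=0)
    idx_b, acc_b = run_once(random_state=0)
    print("same split:", np.array_equal(idx_a, idx_b), "| same accuracy:", acc_a == acc_b)

    # No seed: the split (and usually the accuracy) changes from run to run.
    _, acc_c = run_once()
    _, acc_d = run_once()
    print("unseeded accuracies:", acc_c, acc_d)
```

Any fixed integer works as the seed; 0 is just a convention. The classifier itself is deterministic here, so the split is the only source of variation.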
Upvotes: 3