Reputation: 7723
I am new to ML and am running different classification models. Every time I run a model, I get slightly different results. I learnt online that this is controlled by setting a seed value, but I couldn't achieve reproducibility.
Below is my code, where I tried setting the seed value, but it doesn't help:
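import random
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

random.seed(1234)  # seeds Python's built-in random module only
param_grid = {'C': [0.001, 0.01, 0.1, 1, 10, 100],
              'gamma': [1, 0.1, 0.01, 0.001],
              'kernel': ['linear', 'rbf', 'poly'],
              'class_weight': ['balanced']}
svm = SVC()
svm_cv = GridSearchCV(svm, param_grid, cv=5)
# X_train_std, X_test_std, y_train and y_test are defined earlier (not shown)
svm_cv.fit(X_train_std, y_train)
y_pred = svm_cv.predict(X_test_std)
cm = confusion_matrix(y_test, y_pred)
print(cm)
print("Accuracy is ", accuracy_score(y_test, y_pred))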
Can you help me understand how to set the seed value so that every time I run the above code, I get the same result/accuracy/metric?
Upvotes: 1
Views: 5218
Reputation: 7148
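scikit-learn draws its randomness from NumPy's global random state whenever a random_state argument is left unset, so calling Python's random.seed has no effect on it. Import numpy and set its random seed instead, like this: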
import numpy as np
np.random.seed(1234)
(https://www.mikulskibartosz.name/how-to-set-the-global-random_state-in-scikit-learn/)
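As a complement to the global seed, here is a minimal sketch that also passes random_state explicitly to every step that accepts it. The question does not show how the data was loaded or split, so the make_classification data, the train_test_split call, the shuffled StratifiedKFold, the smaller parameter grid, and the SEED and run_experiment names are all assumptions made for illustration, not the asker's actual setup.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV, StratifiedKFold, train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

SEED = 1234  # hypothetical constant; any fixed integer works


def run_experiment(seed=SEED):
    """Train the grid-searched SVC once and return its test accuracy."""
    # Seed NumPy's global generator, which scikit-learn falls back on
    # whenever random_state is left as None.
    np.random.seed(seed)

    # Synthetic stand-in for the dataset, which the question does not show.
    X, y = make_classification(n_samples=300, n_features=10, random_state=seed)

    # An unseeded split is a common source of run-to-run variation, so pin it.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )

    # Standardize using statistics from the training split only.
    scaler = StandardScaler().fit(X_train)
    X_train_std = scaler.transform(X_train)
    X_test_std = scaler.transform(X_test)

    # Reduced grid (assumption) so the sketch runs quickly.
    param_grid = {
        "C": [0.1, 1, 10],
        "gamma": [0.1, 0.01],
        "kernel": ["linear", "rbf"],
        "class_weight": ["balanced"],
    }
    # Shuffled folds with a fixed seed, so the folds are identical on every run.
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    svm_cv = GridSearchCV(SVC(random_state=seed), param_grid, cv=cv)
    svm_cv.fit(X_train_std, y_train)

    y_pred = svm_cv.predict(X_test_std)
    return accuracy_score(y_test, y_pred)


first = run_experiment()
second = run_experiment()
print("Accuracy is", first)
print("Same result on both runs:", first == second)
```

Passing random_state explicitly keeps the result independent of any other code that touches NumPy's global generator before the model is fit. If results still differ between runs, the unseeded step is probably somewhere outside the snippet, for example in the train/test split.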
Upvotes: 2