scikit-learn SGDClassifier: precision and recall values change on each run

Question

I have a question about precision and recall values in scikit-learn. I am using SGDClassifier to classify my data. To evaluate performance, I am using precision_recall_fscore_support, but each time I run the program I get different values for precision and recall. How can I get the true values? My code is:

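from sklearn import preprocessing
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import precision_recall_fscore_support

# Scale the features using statistics from the training set only
scalerI = preprocessing.StandardScaler()
X_train = scalerI.fit_transform(InputT)
X_test = scalerI.transform(InputCross)
clf = SGDClassifier(loss="log", penalty="elasticnet", n_iter=70)
y_rbf = clf.fit(X_train, TargetT)
y_hat = clf.predict(X_test)
a = clf.predict_proba(X_test)
p_and_rec = precision_recall_fscore_support(TargetCross, y_hat, beta=1)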

Thank you

EdChum · Accepted Answer

According to the docs, SGDClassifier has a random_state parameter that defaults to None. It is the seed for the pseudo-random number generator used to shuffle the data. With None, the shuffling differs from run to run, so the fitted model and its precision and recall differ too. Fix the seed to make the results repeatable: set random_state=0, or whatever favourite number you want.

clf = SGDClassifier(loss="log", penalty="elasticnet", n_iter=70, random_state=0)

This should produce the same results on each run.

From the docs:

random_state : int seed, RandomState instance, or None (default)
    The seed of the pseudo random number generator to use when shuffling the data.
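To check that the seed really is what makes the scores repeatable, here is a minimal sketch on synthetic data. The dataset from make_classification, the train/test split and the run_once helper are stand-ins for the asker's InputT/InputCross data and are not part of the original question. It also assumes a scikit-learn release where the iteration count is passed as max_iter rather than n_iter, and it leaves loss at its default because the name of the logistic loss option has changed between releases.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import precision_recall_fscore_support
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the asker's data (hypothetical, for illustration only).
X, y = make_classification(n_samples=500, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Same preprocessing as in the question: fit the scaler on the training set only.
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)


def run_once(seed):
    """Train a fresh classifier with the given seed and return its scores."""
    # max_iter replaces the older n_iter; tol=None runs all 70 epochs.
    clf = SGDClassifier(penalty="elasticnet", max_iter=70, tol=None,
                        random_state=seed)
    clf.fit(X_train, y_train)
    y_hat = clf.predict(X_test)
    # Returns (precision, recall, fscore, support), one array entry per class.
    return precision_recall_fscore_support(y_test, y_hat, beta=1)


first = run_once(0)
second = run_once(0)

# With the same seed, every array in the two results should match exactly.
repeatable = all(np.array_equal(a, b) for a, b in zip(first, second))
print("same scores with a fixed seed:", repeatable)
```

Calling run_once(None) instead should give scores that can drift from one program run to the next, which is the behaviour described in the question.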
