Reputation: 129
I have a question about the precision and recall values in scikit-learn. I am using SGDClassifier to classify my data.
To evaluate performance, I am using precision_recall_fscore_support, but each time I run the program I get different precision and recall values. How can I get the true values?
My code is:
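from sklearn import preprocessing
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import precision_recall_fscore_support

# Scale the features using statistics from the training set only
scalerI = preprocessing.StandardScaler()
X_train = scalerI.fit_transform(InputT)
X_test = scalerI.transform(InputCross)
# Logistic-regression loss with an elastic-net penalty
clf = SGDClassifier(loss="log", penalty="elasticnet", n_iter=70)
y_rbf = clf.fit(X_train, TargetT)
# Predicted labels and class probabilities for the held-out set
y_hat = clf.predict(X_test)
a = clf.predict_proba(X_test)
p_and_rec = precision_recall_fscore_support(TargetCross, y_hat, beta=1)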
Thank you
Upvotes: 1
Views: 1459
Reputation: 394409
According to the docs, SGDClassifier has a random_state parameter that defaults to None. It is the seed for the random number generator used when shuffling the data, so with the default each run can train a slightly different model and give different precision and recall values. To make the results repeatable, fix this value, for example random_state=0 or whatever favourite number you want:
clf = SGDClassifier(loss="log", penalty="elasticnet", n_iter=70, random_state=0)
This should produce the same results on every run.
From the docs:
random_state : int seed, RandomState instance, or None (default) The seed of the pseudo random number generator to use when shuffling the data.
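To check this yourself, here is a minimal sketch that fits the classifier twice with the same seed and compares the scores. It is not the asker's setup: the data comes from make_classification as a stand-in, the helper fit_and_score is a hypothetical name, and it uses max_iter (the parameter that replaced n_iter in later scikit-learn releases) and leaves loss at its default, because the name of the logistic loss differs between versions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import precision_recall_fscore_support
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (an assumption; the question's InputT / TargetT are not shown)
X, y = make_classification(n_samples=500, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Scale with statistics computed on the training split only
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)


def fit_and_score(seed):
    """Hypothetical helper: train with a given seed and return the score tuple."""
    clf = SGDClassifier(
        penalty="elasticnet",
        max_iter=70,
        tol=None,           # run all 70 epochs instead of stopping early
        random_state=seed,  # fixes the shuffling of the training data
    )
    clf.fit(X_train, y_train)
    y_hat = clf.predict(X_test)
    # Returns (precision, recall, fscore, support), one array entry per class
    return precision_recall_fscore_support(y_test, y_hat, beta=1)


run_a = fit_and_score(0)
run_b = fit_and_score(0)

# True when both runs give identical precision, recall, F-score and support
same = all(np.array_equal(a, b) for a, b in zip(run_a, run_b))
print(same)
```

Fixing the seed only makes a run repeatable. The scores are still those of one particular shuffling, so a different seed can give somewhat different numbers.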
Upvotes: 2