sushmita P

Reputation: 1

Getting completely different weight values when using sklearn.linear_model.SGDClassifier with different random_state values for logistic regression

I expected the weights to change only slightly with a different random state. What could be the reason for getting different weights on every run with random_state = None?

Following are the weight values from a few runs (the data contains 3 features):

1) 4.67100318, 1.26129186, 17.26554955
2) 3.39793468, 2.10265234, 18.42484435
3) -2.08082186, 1.25948975, 10.37120852
4) 3.71122156, 0.93510126, 16.63007864

Because of these fluctuations, I am not sure which random_state I should use, and this is causing trouble when performing feature selection. Please note that I am using the data after standardisation.

I am using the very simple code below to train my model, as my data contains only 200 rows with 3 features:

from sklearn.linear_model import SGDClassifier
# loss='log' selects logistic regression (renamed to 'log_loss' in scikit-learn 1.1+)
SGDClf = SGDClassifier(loss='log', random_state=1)
SGDClf.fit(X, Y)

Upvotes: 0

Views: 314

Answers (1)

Alex Ricciardi

Reputation: 431

Machine learning models that rely on randomness will produce different results on the same dataset when random_state = None, because each run then draws a different sequence of random numbers.
The random seed initializes that sequence, which is used, for example, when generating test, validation and training datasets from a given dataset, or when shuffling the data during training. Passing an integer, e.g. random_state = 1, fixes the seed.
Setting a model's seed to a fixed value ensures that the (weight) results are reproducible.

SGDClassifier() shuffles the training data after each epoch by default (shuffle=True), and random_state controls that shuffling:

The passed (random state) value will have an effect on the reproducibility of the results returned by the function (fit, split, or any other function like k_means). - random state doc

Hope it is helpful

Upvotes: 1
