nit254

Reputation: 31

What does the class_weight parameter do in scikit-learn SGD?

I am a frequent user of scikit-learn, and I would like some insight into what the "class_weight" parameter does with SGD.

I was able to trace the code down to this function call:

plain_sgd(coef, intercept, est.loss_function,
                 penalty_type, alpha, C, est.l1_ratio,
                 dataset, n_iter, int(est.fit_intercept),
                 int(est.verbose), int(est.shuffle), est.random_state,
                 pos_weight, neg_weight,
                 learning_rate_type, est.eta0,
                 est.power_t, est.t_, intercept_decay)

https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/linear_model/stochastic_gradient.py

After this it goes into sgd_fast, and I am not very good with Cython. Can you give some clarity on these questions?

  1. I have a class imbalance in the dev set: the positive class has about 15k samples and the negative class about 36k. Will class_weight resolve this problem, or would undersampling be a better idea? I am getting better numbers with it, but it's hard to explain why.
  2. If yes, how does it actually do it? Is it applied to the feature penalization, or is it a weight on the optimization function? How can I explain this to a layman?

Upvotes: 3

Views: 3144

Answers (1)

ogrisel

Reputation: 40169

class_weight can indeed help increase the ROC AUC or F1 score of a classification model trained on imbalanced data.

You can try class_weight="auto" to select weights that are inversely proportional to class frequencies. You can also pass your own weights as a Python dictionary with class labels as keys and weights as values.
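For illustration, a minimal sketch of both options on a synthetic imbalanced dataset (the data and the weight value 2.4 are hypothetical; note that in later scikit-learn releases the "auto" heuristic was renamed "balanced"):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier

# Toy imbalanced problem: roughly 70% negatives / 30% positives.
X, y = make_classification(n_samples=1000, weights=[0.7, 0.3],
                           random_state=0)

# Weights inversely proportional to class frequencies
# ("auto" in old scikit-learn versions, "balanced" in current ones).
clf_auto = SGDClassifier(class_weight="balanced", random_state=0).fit(X, y)

# Explicit per-class weights: {class label: weight}.
clf_dict = SGDClassifier(class_weight={0: 1.0, 1: 2.4},
                         random_state=0).fit(X, y)
```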

Tuning the weights can be achieved via grid search with cross-validation.
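A sketch of that tuning loop, assuming a synthetic dataset and an arbitrary grid of candidate weights for the positive class, scored with F1 so the imbalance is reflected in model selection:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=1000, weights=[0.7, 0.3],
                           random_state=0)

# Candidate class_weight dicts: keep the negative class at 1,
# vary the positive-class weight (values here are arbitrary).
param_grid = {"class_weight": [{0: 1, 1: w} for w in (1, 2, 4, 8)]}

search = GridSearchCV(SGDClassifier(random_state=0), param_grid,
                      scoring="f1", cv=5)
search.fit(X, y)
```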

Internally this is done by deriving a sample_weight from the class_weight (depending on the class label of each sample). Sample weights are then used to scale the contribution of individual samples to the loss function used to train the linear classification model with Stochastic Gradient Descent.
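A quick way to see this on toy data (the weight value 3.0 is an arbitrary choice): giving the positive class a class_weight of 3 should match passing an explicit per-sample weight of 3 for the positive samples, since with a fixed random_state both fits see the samples in the same order.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier

X, y = make_classification(n_samples=500, weights=[0.7, 0.3],
                           random_state=0)

# Weight the positive class 3x via class_weight.
clf_cw = SGDClassifier(class_weight={0: 1.0, 1: 3.0},
                       random_state=0).fit(X, y)

# Same thing, deriving the per-sample weights by hand.
sw = np.where(y == 1, 3.0, 1.0)
clf_sw = SGDClassifier(random_state=0).fit(X, y, sample_weight=sw)
```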

The feature penalization is controlled independently via the penalty and alpha hyperparameters. sample_weight / class_weight have no impact on it.

Upvotes: 6
