Reputation: 2114
I want to use importance sampling when I train the SGDClassifier. I've seen there is a sample_weight parameter in the fit and partial_fit methods, but I am not sure how this parameter works.
Let's say that I have 10 samples and I pass an array of 10 weights. Will each weight multiply the loss function for its corresponding sample?
Upvotes: 2
Views: 532
Reputation: 16079
You can find the relevant code in linear_model.sgd_fast, the most pertinent line being:
update *= class_weight * sample_weight
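That is, the update computed for each sample is scaled by that sample's weight (and by its class weight, if one is given) before it is applied to the coefficients. Since the update is proportional to the gradient of the loss for that sample, this has the same effect as multiplying that sample's loss by its weight.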
An example of the high-level result can be found in the user guide under "SGD: Weighted samples".
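To see the scaling directly, here is a minimal sketch that runs a single SGD step on one sample with two different weights. The helper name `one_step_coef` and the toy data are made up for illustration. Regularization is switched off (`alpha=0.0`) and a constant learning rate is used so that the only thing changing between the two runs is the sample weight.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier


def one_step_coef(weight):
    """Run one SGD step on a single sample and return the coefficients.

    Hypothetical helper for illustration only; not part of scikit-learn.
    """
    clf = SGDClassifier(
        loss="hinge",
        alpha=0.0,                 # no regularization, so only the loss term moves the weights
        learning_rate="constant",  # fixed step size eta0
        eta0=0.1,
        random_state=0,
    )
    X = np.array([[1.0, -2.0]])    # a single toy sample
    y = np.array([1])
    # partial_fit performs one pass over the data, i.e. one update here
    clf.partial_fit(X, y, classes=np.array([0, 1]),
                    sample_weight=np.array([weight]))
    return clf.coef_.ravel()


coef_w1 = one_step_coef(1.0)
coef_w2 = one_step_coef(2.0)
print("weight 1:", coef_w1)
print("weight 2:", coef_w2)
```

Under these assumptions, doubling the weight should double the step taken for that sample. With 10 samples you would pass a length-10 array to `sample_weight` in the same way, one weight per row of `X`.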
Upvotes: 3