Reputation: 2527
Can someone explain mathematically how sample_weight and class_weight are used in Keras when computing the loss function and metrics? A simple mathematical expression would be great.
Upvotes: 14
Views: 3308
Reputation: 7790
It is a simple multiplication: the loss contributed by each sample is scaled by its sample weight. Assume samples $i = 1, \dots, n$, a vector of sample weights $w$ of length $n$, and let $L_i$ denote the loss for sample $i$. The weighted loss is then:

$$L = \frac{1}{n} \sum_{i=1}^{n} w_i L_i$$
In Keras in particular, each sample's weighted loss is additionally divided by the fraction of weights that are non-zero, so that the batch loss is averaged over the samples with non-zero weight rather than over all $n$ samples. Let $p$ be the proportion of non-zero weights:

$$L = \frac{1}{n} \sum_{i=1}^{n} \frac{w_i L_i}{p}$$
Here's the relevant snippet of code from the Keras repo:
    score_array = loss_fn(y_true, y_pred)
    if weights is not None:
        score_array *= weights
        score_array /= K.mean(K.cast(K.not_equal(weights, 0), K.floatx()))
    return K.mean(score_array)
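To make the arithmetic concrete, here is a minimal NumPy sketch that mirrors the snippet above step by step. The function name `weighted_mean_loss` and the example numbers are made up for illustration; this is not Keras code, only a re-statement of the same three operations (multiply by the weights, divide by the fraction of non-zero weights, take the mean).

```python
import numpy as np

def weighted_mean_loss(per_sample_loss, weights=None):
    """Illustrative NumPy version of the weighting steps quoted above."""
    score = np.asarray(per_sample_loss, dtype=float)
    if weights is not None:
        w = np.asarray(weights, dtype=float)
        score = score * w          # scale each sample's loss by its weight
        p = np.mean(w != 0)        # proportion of non-zero weights
        score = score / p          # renormalise by that proportion
    return float(np.mean(score))   # mean over all n samples

# Hypothetical per-sample losses and weights (two samples are masked out with weight 0).
losses = np.array([0.5, 1.0, 2.0, 4.0])
weights = np.array([1.0, 0.0, 2.0, 0.0])

batch_loss = weighted_mean_loss(losses, weights)
print(batch_loss)  # 2.25, i.e. (0.5*1 + 2.0*2) / 2 non-zero-weight samples
```

Because the mean over n samples is divided by p, the result equals the sum of the weighted losses divided by the number of samples whose weight is non-zero.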
class_weight is used in the same way as sample_weight; it is just provided as a convenience to specify certain weights across entire classes.
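As a rough sketch of that equivalence, a class_weight dictionary can be expanded into one weight per sample by looking up each sample's class. The dictionary values and labels below are hypothetical, and the lookup is an illustration of the idea rather than the code Keras runs internally.

```python
import numpy as np

# Hypothetical weights: class 1 counts five times as much as class 0.
class_weight = {0: 1.0, 1: 5.0}

# Hypothetical integer class labels for five samples.
y_true = np.array([0, 1, 1, 0, 1])

# Each sample receives the weight assigned to its class.
sample_weight = np.array([class_weight[int(c)] for c in y_true])
print(sample_weight)  # [1. 5. 5. 1. 5.]
```

The resulting vector plays the role of w in the weighted loss, so every sample of a given class is scaled by the same factor.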
The sample weights are currently not applied to metrics, only loss.
Upvotes: 16