Reputation: 1036
According to the docs, the Reduction parameter takes on 3 values: SUM_OVER_BATCH_SIZE, SUM and NONE.
y_true = [[0., 2.], [0., 0.]]
y_pred = [[3., 1.], [2., 5.]]
mae = tf.keras.losses.MeanAbsoluteError(reduction=tf.keras.losses.Reduction.SUM)
mae(y_true, y_pred).numpy()
> 5.5
mae = tf.keras.losses.MeanAbsoluteError()
mae(y_true, y_pred).numpy()
> 2.75
What I could infer about the calculation after various trials is this:
When reduction = SUM:
Loss = sum over all samples of (sum of absolute differences between y_pred and y_target of each sample / number of elements in y_target of the sample) = (abs(3-0) + abs(1-2))/2 + (abs(2-0) + abs(5-0))/2 = 4/2 + 7/2 = 5.5.
When reduction = SUM_OVER_BATCH_SIZE:
Loss = [sum over all samples of (sum of absolute differences between y_pred and y_target of each sample / number of elements in y_target of the sample)] / batch size (number of samples) = [ (abs(3-0) + abs(1-2))/2 + (abs(2-0) + abs(5-0))/2 ] / 2 = [ 4/2 + 7/2 ] / 2 = 5.5/2 = 2.75.
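The arithmetic above can be verified in plain Python, without TensorFlow; `per_sample` here mimics the per-sample mean absolute error that Keras computes before applying the reduction:

```python
y_true = [[0., 2.], [0., 0.]]
y_pred = [[3., 1.], [2., 5.]]

# Per-sample MAE: mean of absolute differences within each sample.
per_sample = [sum(abs(p - t) for p, t in zip(pred, true)) / len(true)
              for pred, true in zip(y_pred, y_true)]

print(per_sample)                         # [2.0, 3.5]
print(sum(per_sample))                    # SUM: 5.5
print(sum(per_sample) / len(per_sample))  # SUM_OVER_BATCH_SIZE: 2.75
```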
As a result, SUM_OVER_BATCH_SIZE is nothing but SUM/batch_size. Then why is it called SUM_OVER_BATCH_SIZE, when SUM actually adds up the losses over the entire batch, while SUM_OVER_BATCH_SIZE calculates the average loss of the batch?
Is my assumption regarding the workings of SUM_OVER_BATCH_SIZE and SUM at all correct?
Upvotes: 7
Views: 9147
Reputation: 500
Your assumption is correct as far as I understand. If you check [keras/losses_utils.py][1] on GitHub (lines 260-269), you will see that it performs as expected.
SUM will sum up the losses along the batch dimension, and SUM_OVER_BATCH_SIZE divides that SUM by the number of total losses (the batch size).
def reduce_weighted_loss(weighted_losses,
                         reduction=ReductionV2.SUM_OVER_BATCH_SIZE):
  if reduction == ReductionV2.NONE:
    loss = weighted_losses
  else:
    loss = tf.reduce_sum(weighted_losses)
    if reduction == ReductionV2.SUM_OVER_BATCH_SIZE:
      loss = _safe_mean(loss, _num_elements(weighted_losses))
  return loss
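The same logic can be sketched in pure Python (a simplified mirror of the snippet above, not the real implementation: the actual code operates on tensors and uses `_safe_mean` to guard against division by zero):

```python
def reduce_weighted_loss_sketch(weighted_losses, reduction="sum_over_batch_size"):
    # Simplified stand-in for Keras's reduce_weighted_loss.
    if reduction == "none":
        return list(weighted_losses)         # no reduction: return per-sample losses
    total = sum(weighted_losses)             # SUM: add up all per-sample losses
    if reduction == "sum_over_batch_size":
        return total / len(weighted_losses)  # SUM / number of losses (batch size)
    return total

per_sample = [2.0, 3.5]  # per-sample MAE values from the question's example
print(reduce_weighted_loss_sketch(per_sample, "sum"))  # 5.5
print(reduce_weighted_loss_sketch(per_sample))         # 2.75
```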
You can do an easy check with your previous example just by adding one pair of outputs with zero loss.
y_true = [[0., 2.], [0., 0.],[1.,1.]]
y_pred = [[3., 1.], [2., 5.],[1.,1.]]
mae = tf.keras.losses.MeanAbsoluteError(reduction=tf.keras.losses.Reduction.SUM)
mae(y_true, y_pred).numpy()
> 5.5
mae = tf.keras.losses.MeanAbsoluteError()
mae(y_true, y_pred).numpy()
> 1.8333
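Repeating the hand calculation for this padded batch confirms the numbers (a quick sanity check in plain Python, not TensorFlow code):

```python
y_true = [[0., 2.], [0., 0.], [1., 1.]]
y_pred = [[3., 1.], [2., 5.], [1., 1.]]

per_sample = [sum(abs(p - t) for p, t in zip(pred, true)) / len(true)
              for pred, true in zip(y_pred, y_true)]  # [2.0, 3.5, 0.0]

print(sum(per_sample))                              # SUM is unchanged: 5.5
print(round(sum(per_sample) / len(per_sample), 4))  # 5.5 / 3 = 1.8333
```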
So, your assumption is correct.

[1]: https://github.com/keras-team/keras/blob/v2.7.0/keras/utils/losses_utils.py#L25-L84
Upvotes: 5