user12070267

Reputation:

Could someone please illustrate the underlying approach of sklearn.mean_absolute_error for 2 matrices?

per doc

The mean_absolute_error function computes mean absolute error, a risk metric corresponding to the expected value of the absolute error loss or L1-norm loss.

I understand the process for 2 "vectors"

>>> from sklearn.metrics import mean_absolute_error
>>> y_true = [3, -0.5, 2, 7]
>>> y_pred = [2.5, 0.0, 2, 7]
>>> mean_absolute_error(y_true, y_pred)
0.25

Add all the absolute differences between corresponding elements of the two vectors, then divide by the length of the vector.

This code reproduces the underlying processing of sklearn.mean_absolute_error for 2 "vectors":

import numpy as np

res = 0
for t, p in zip(y_true, y_pred):
    res = res + np.abs(t - p)
res / len(y_true)  # 0.25

What I cannot understand is the approach for matrices:

>>> y_true = [[0.5, 1], [-1, 1], [7, -6]]
>>> y_pred = [[0, 2], [-1, 2], [8, -5]]
>>> mean_absolute_error(y_true, y_pred)
0.75

It is obviously not this procedure:

>>> res = 0
>>> for t, p in zip(y_true, y_pred):
...     res = res + np.abs(t[0] - p[0]) + np.abs(t[1] - p[1])
>>> res / 4
1.125

Could someone please illustrate the underlying approach of sklearn.mean_absolute_error for 2 matrices?

Upvotes: 1

Views: 46

Answers (1)

xdurch0

Reputation: 10474

With matrices as input, the total absolute error is simply divided by the total number of elements. In your example the total is 4.5 (0.5 + 1 + 0 + 1 + 1 + 1) and there are six elements (three rows times two columns), so mean_absolute_error returns 4.5/6 = 0.75 as expected. Your loop computed the correct total of 4.5 but divided it by 4 instead of 6, which is why it gave 1.125.
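A minimal sketch of the equivalent computation in NumPy (not sklearn's actual implementation, just the same arithmetic): take the element-wise absolute differences and average over every element of the matrix.

```python
import numpy as np

y_true = np.array([[0.5, 1], [-1, 1], [7, -6]])
y_pred = np.array([[0, 2], [-1, 2], [8, -5]])

# Element-wise absolute differences, then the mean over ALL elements
# (sum of 4.5 divided by 6 entries).
mae = np.abs(y_true - y_pred).mean()
print(mae)  # 0.75
```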

Upvotes: 1
