Reputation: 379
I have trouble understanding why my custom RMSE loss function produces the same value as MAE.
I have a model which I train once with loss='mae' and once with a custom function:
def root_mean_squared_error(y_true, y_pred):
    return keras.sqrt(keras.mean(keras.square(y_pred - y_true), axis=-1))
(I know there is no actual benefit in taking the sqrt, but I did it just to get a grasp on things.) My problem is that when I use the custom function, the output shows a loss value equal to the MAE. For example, here is some sample output:
/1 [==============================] - 0s 173ms/step - loss: 0.0450 - mean_squared_error: 0.0091 - mean_absolute_error: 0.0450
Epoch 96/100
1/1 [==============================] - 0s 169ms/step - loss: 0.0449 - mean_squared_error: 0.0091 - mean_absolute_error: 0.0449
Epoch 97/100
1/1 [==============================] - 0s 172ms/step - loss: 0.0448 - mean_squared_error: 0.0091 - mean_absolute_error: 0.0448
Epoch 98/100
1/1 [==============================] - 0s 166ms/step - loss: 0.0447 - mean_squared_error: 0.0091 - mean_absolute_error: 0.0447
Epoch 99/100
1/1 [==============================] - 0s 170ms/step - loss: 0.0447 - mean_squared_error: 0.0091 - mean_absolute_error: 0.0447
MAE should not be the same as RMSE. Another strange thing is that I expected RMSE to be sqrt(MSE), but from the numbers above, it's not: sqrt(0.0091) is roughly 0.095, while the reported loss is about 0.045.
Although it doesn't add more info, here is my compile line:
model.compile(optimizer = Adam(lr = 1e-4), loss = root_mean_squared_error, metrics=['mse', 'mae'])
Edit: My training data and target data are monochrome images with 1 channel (so the tensor shape is (None, 256, 256, 1)).
Upvotes: 1
Views: 693
Reputation: 379
Found my problem after some debugging. Kota Mori's answer was almost fully correct. Thanks.
My tensor shape is (256, 256, 1). Taking the mean over axis -1 averaged over the LAST dimension only, which contained a single value, so it had no effect. The resulting tensor shape was (256, 256). As Kota said, the sqrt then canceled the square, and the mean over all data points is calculated automatically by Keras.
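To make the effect concrete, here is a small NumPy sketch of the same arithmetic. It is not the Keras code itself: the array names, the reduced 4x4 image size, and the random data are assumptions chosen for illustration only.

```python
import numpy as np

# Hypothetical stand-ins for a batch of single-channel images: (batch, H, W, 1).
rng = np.random.default_rng(0)
y_true = rng.random((2, 4, 4, 1))
y_pred = rng.random((2, 4, 4, 1))

# Same arithmetic as the original loss: the mean over axis=-1 averages
# exactly one value per pixel, so it only drops the channel dimension.
per_pixel = np.sqrt(np.mean(np.square(y_pred - y_true), axis=-1))
print(per_pixel.shape)  # (2, 4, 4)

# sqrt(square(x)) is |x|, so every entry is just the absolute error.
abs_error = np.abs(y_pred - y_true)[..., 0]
same_as_abs = np.allclose(per_pixel, abs_error)

# Averaging whatever is left therefore gives the MAE.
loss_value = per_pixel.mean()
mae = np.abs(y_pred - y_true).mean()
print(same_as_abs, np.isclose(loss_value, mae))
```

Both checks come out true, which matches the log in the question where loss and mean_absolute_error agree.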
There is a simple way to calculate RMSE. Here is the corrected code:
def root_mean_squared_error(y_true, y_pred):
    return keras.sqrt(keras.mean(keras.square(y_pred - y_true), axis=[-1, -2, -3]))
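As a rough check of what reducing over the last three axes does, the following NumPy sketch mirrors the corrected function. The batch size of 3, the 4x4 image size, and the random data are assumptions for illustration; NumPy takes the axes as a tuple where the Keras code above passes a list.

```python
import numpy as np

# Hypothetical batch of three single-channel images: (batch, H, W, 1).
rng = np.random.default_rng(1)
y_true = rng.random((3, 4, 4, 1))
y_pred = rng.random((3, 4, 4, 1))

# Reduce over channel, width and height: one RMSE value per image remains.
per_image_rmse = np.sqrt(
    np.mean(np.square(y_pred - y_true), axis=(-1, -2, -3))
)
print(per_image_rmse.shape)  # (3,)

# Cross-check the first image against a direct computation.
expected_first = np.sqrt(((y_pred[0] - y_true[0]) ** 2).mean())
print(np.isclose(per_image_rmse[0], expected_first))
```

The result has one entry per image, so the value reported during training would be the average of these per-image RMSEs. With a single image per batch that is the RMSE of that image.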
Upvotes: 0
Reputation: 6750
mean with axis=-1 calculates the row-wise average. So, if you have only one column, nothing changes (it is the average of a single value). Applying sqrt then cancels the square, and you end up with abs.
The loss function in Keras seems to define the row-wise loss; the average over rows is then computed internally by Keras.
The Keras documentation says (https://keras.io/losses/):
The actual optimized objective is the mean of the output array across all datapoints.
This means there is no easy way to define RMSE in Keras, since it cannot be written as an average of a row-wise loss.
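A tiny numeric example may help show why the square root does not commute with the averaging step. The two-row arrays below are made up for illustration, and the code uses NumPy rather than Keras.

```python
import numpy as np

# Hypothetical targets and predictions: two rows, one column each.
y_true = np.array([[0.0], [0.0]])
y_pred = np.array([[1.0], [3.0]])

squared = np.square(y_pred - y_true)  # [[1.], [9.]]

# Row-wise sqrt first, then the average over rows (what a per-row loss yields).
mean_of_row_rmse = np.sqrt(squared.mean(axis=-1)).mean()  # (1 + 3) / 2

# Square root taken once over the whole batch (the usual RMSE).
batch_rmse = np.sqrt(squared.mean())  # sqrt((1 + 9) / 2)

print(mean_of_row_rmse, batch_rmse)
```

The first value is 2.0 and the second is sqrt(5), roughly 2.236, so averaging per-row square roots does not reproduce the batch-level RMSE here.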
Upvotes: 1