sklearn mean_squared_error ignores the squared argument with multioutput='raw_values'

Question

The documentation page for sklearn's mean_squared_error function gives examples of how to use it, including with multioutput data and for calculating the RMSE. The problem is that calculating the RMSE on multiple outputs does not work.

Here is the code I used:

from sklearn.metrics import mean_squared_error

y_true = [[0.5, 1],[-1, 1],[7, -6]]
y_pred = [[0, 2],[-1, 2],[8, -5]]

mean_squared_error(y_true, y_pred)  # This returns the MSE
#out: 0.7083333333333334

mean_squared_error(y_true, y_pred, squared=False)  # And the RMSE works too
#out: 0.8416254115301732

mean_squared_error(y_true, y_pred, multioutput='raw_values')  # I can use the MSE for multiple outputs
#out: array([0.41666667, 1.        ])

mean_squared_error(y_true, y_pred, multioutput='raw_values', squared=False)  # But not the RMSE
#out: array([0.41666667, 1.        ])

# However
import numpy as np

np.sqrt(mean_squared_error(y_true, y_pred, multioutput='raw_values'))  # Numpy gives the correct results
#out: array([0.64549722, 1.        ])

Some specifications:

Python 3.6.8 (default, Oct  7 2019, 12:59:55)
[GCC 8.3.0] on linux

sklearn.__version__
'0.22'

np.__version__
'1.17.4'

I looked at the source code but I don't see why this does not work.

Trenton McKinney · Accepted Answer

This is a known issue that has since been closed. It does not occur in sklearn 0.23.2, the current version as of this answer.
I could not reproduce it with numpy 1.19.1 and sklearn 0.23.2.
With those versions, mean_squared_error(y_true, y_pred, multioutput='raw_values', squared=False) and np.sqrt(mean_squared_error(y_true, y_pred, multioutput='raw_values')) return the same values.
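For reference, the per-output values both calls should produce can be checked by hand. The sketch below uses only the standard library and the data from the question; the variable names are illustrative and it is not how sklearn computes the metric internally.

```python
import math

# Data from the question: three samples, two outputs each.
y_true = [[0.5, 1], [-1, 1], [7, -6]]
y_pred = [[0, 2], [-1, 2], [8, -5]]

n_samples = len(y_true)
n_outputs = len(y_true[0])

# Per-output MSE: average the squared errors down each column.
mse_per_output = [
    sum((t[j] - p[j]) ** 2 for t, p in zip(y_true, y_pred)) / n_samples
    for j in range(n_outputs)
]

# Per-output RMSE: square root of each column's MSE.
rmse_per_output = [math.sqrt(m) for m in mse_per_output]

print(mse_per_output)   # approximately [0.4167, 1.0]
print(rmse_per_output)  # approximately [0.6455, 1.0]
```

The first list matches the MSE output in the question, and the second matches what np.sqrt returned there, so a correct squared=False call should give roughly [0.6455, 1.0].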
The resolution is to upgrade.
If upgrading is not an option, edit the sklearn source:
- Go to this line: https://github.com/scikit-learn/scikit-learn/blob/b194674c4/sklearn/metrics/_regression.py#L258
- Replace "return output_errors" with "return output_errors if squared else np.sqrt(output_errors)"
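If editing the installed package is not practical either, the np.sqrt approach from the question can be wrapped so that the squared argument is never used. This is a minimal sketch; per_output_rmse is a hypothetical helper name, not part of sklearn.

```python
import numpy as np
from sklearn.metrics import mean_squared_error


def per_output_rmse(y_true, y_pred):
    """Hypothetical helper: RMSE for each output column.

    Takes the square root of the per-output MSE instead of passing
    `squared=False`, so it does not depend on how `squared` is handled.
    """
    mse = mean_squared_error(y_true, y_pred, multioutput="raw_values")
    return np.sqrt(mse)


# Data from the question.
y_true = [[0.5, 1], [-1, 1], [7, -6]]
y_pred = [[0, 2], [-1, 2], [8, -5]]

print(per_output_rmse(y_true, y_pred))
```

With the question's data this should print the same array as the np.sqrt call in the question, approximately [0.6455, 1.0].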
