Reputation: 43
I'm training machine learning models in Python and using the R squared metric from scikit-learn to evaluate them. I decided to play around with scikit-learn's r2_score function, feeding it an array of identical values as y_true and a slightly different array as y_pred. I was getting arbitrarily large negative values when the input arrays had length 10 or more, and 0 when they were shorter than 10.
>>> from sklearn.metrics import r2_score
>>> r2_score([213.91666667, 213.91666667, 213.91666667, 213.91666667, 213.91666667,
...           213.91666667, 213.91666667, 213.91666667, 213.91666667, 213.91666667],
...          [213, 214, 214, 214, 214, 214, 214, 214, 214, 214])
-1.1175847590636849e+26
>>> r2_score([213.91666667, 213.91666667, 213.91666667, 213.91666667,
...           213.91666667, 213.91666667, 213.91666667, 213.91666667, 213.91666667],
...          [213, 214, 214, 214, 214, 214, 214, 214, 214])
0.0
Upvotes: 4
Views: 4068
Reputation: 520
You're correct in noting that the r2_score output looks wrong. However, this is the result of a simple floating-point computation issue rather than a problem with the scikit-learn package.
Try running
>>> input_list = [213.91666667, 213.91666667, 213.91666667, 213.91666667, 213.91666667,
...               213.91666667, 213.91666667, 213.91666667, 213.91666667, 213.91666667]
>>> sum(input_list)/len(input_list)
As you can see, the output is not exactly 213.91666667 (a limited-precision floating-point error; you can read more about it here).
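To make the discrepancy explicit (a minimal check; the exact size of the error depends on your platform's floating-point arithmetic, but it's on the order of 10^-14 here):
>>> mean = sum(input_list) / len(input_list)
>>> mean == 213.91666667   # the mean of ten identical values "should" equal that value...
False
>>> abs(mean - 213.91666667) > 0   # ...but a tiny rounding error crept in
True
Why does this matter?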
Well, this section of the scikit-learn User Guide gives the specific formula used to calculate r2_score:
R^2 = 1 - Σ(y_i - ŷ_i)^2 / Σ(y_i - ȳ)^2
where ŷ_i are the predicted values and ȳ is the mean of the true values. As you can see, the r2_score is simply 1 - (residual sum of squares)/(total sum of squares).
In the first case you specify, the residual sum of squares is equal to some number that doesn't really matter on its own. You can calculate it easily; it's about 0.9, which doesn't seem super high. However, due to the floating-point error described above, the total sum of squares isn't exactly 0, but rather some very, very small number (around 8 x 10^-27 -- very small).
Thus, when you divide the residual sum of squares (around 0.9) by the total sum of squares (a very small number), you're left with a very large number. Since that large number is subtracted from 1, you are left with a negative number of enormous magnitude as your r2_score output.
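If you want to double-check this arithmetic, here's a quick NumPy sketch of the two sums (not scikit-learn's exact code, just the same computation; the printed values depend on your platform but should look like the ones described above):
import numpy as np

y_true = np.array([213.91666667] * 10)
y_pred = np.array([213.0] + [214.0] * 9)

rss = ((y_true - y_pred) ** 2).sum()         # residual sum of squares: ~0.9
tss = ((y_true - y_true.mean()) ** 2).sum()  # total sum of squares: tiny, but not 0

print(rss, tss)       # roughly 0.9 and 8e-27
print(1 - rss / tss)  # a huge negative number -- r2_score's output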
This imprecision in the calculation of the total sum of squares does not occur in the second case, so there the denominator is exactly 0; the function, seeing that the score is undefined in that situation, returns 0 instead.
Upvotes: 5
Reputation: 36619
Looking at the source code of r2_score, we can see the following computation (reproduced here with the default weights filled in):
import numpy as np

# Defaults used when no sample weights are passed
weight = 1
sample_weight = None

y_true = np.array([213.91666667, 213.91666667, 213.91666667, 213.91666667, 213.91666667,
                   213.91666667, 213.91666667, 213.91666667, 213.91666667, 213.91666667]).reshape(-1, 1)
y_pred = np.array([213, 214, 214, 214, 214, 214, 214, 214, 214, 214]).reshape(-1, 1)

# Residual sum of squares
numerator = (weight * (y_true - y_pred) ** 2).sum(axis=0,
                                                  dtype=np.float64)

# Total sum of squares, built from the floating-point mean of y_true
denominator = (weight * (y_true - np.average(
    y_true, axis=0, weights=sample_weight)) ** 2).sum(axis=0,
                                                      dtype=np.float64)

nonzero_denominator = denominator != 0
nonzero_numerator = numerator != 0
valid_score = nonzero_denominator & nonzero_numerator

# Only "valid" outputs get the usual 1 - RSS/TSS; a nonzero numerator with a
# zero denominator is arbitrarily set to 0 to avoid -inf scores
output_scores = np.ones([y_true.shape[1]])
output_scores[valid_score] = 1 - (numerator[valid_score] /
                                  denominator[valid_score])
output_scores[nonzero_numerator & ~nonzero_denominator] = 0.

r2 = np.average(output_scores, weights=None)  # what r2_score ultimately returns
The problematic line in your case is the denominator calculation.
For the first case:
denominator = (weight * (y_true - np.average(
y_true, axis=0, weights=sample_weight)) ** 2).sum(axis=0,
dtype=np.float64)
print(denominator)
[ 8.07793567e-27]
It's pretty small, but not 0.
For the second case, it is exactly 0. Since a zero denominator makes the r2_score undefined, the function returns 0 instead; see the check below.
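You can confirm this by running the same denominator computation on the nine-element input from the question (reusing weight and sample_weight from above; here the floating-point mean happens to come out exact, so everything cancels):
y_true2 = np.array([213.91666667] * 9).reshape(-1, 1)
denominator2 = (weight * (y_true2 - np.average(
    y_true2, axis=0, weights=sample_weight)) ** 2).sum(axis=0,
                                                       dtype=np.float64)
print(denominator2)
[ 0.]
Hope I'm clear.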
Upvotes: 1
Reputation: 7506
This has nothing to do with scikit-learn, but with the concept of R^2 itself. Intuitively, you can think of it as the proportion of the variance of Y that is explained by your explanatory variables X.
Here the variance of the true Y values is zero (you always repeat the same number), hence the R^2 of zero (at least when the two vectors have the same length); see the quick check below.
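A quick way to see the degenerate case (using the nine-element input from the question, where the floating-point mean happens to be exact):
>>> import numpy as np
>>> np.var([213.91666667] * 9)   # constant y_true: no variance to explain
0.0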
If the two vectors had different lengths... well, the regression itself would not be well defined. I guess it is better for the function to raise an error there (which, for mismatched lengths, scikit-learn in fact does).
Upvotes: 0