Reputation: 11
I was under the impression that the R-squared value is bounded between 0 and 1. However, after reading some online literature/forums, and from some personal experience, I now believe that:
a) the R-squared value (coefficient of determination) is bounded between negative infinity and 1
b) the adjusted R-squared value can go beyond 1 (up to positive infinity, I assume?)
My reason for (a) is that I have fitted some regression models that returned a negative R-squared value. This happens when the best-fit line performs worse than simply predicting the mean of the dependent variable, i.e. when RSS > TSS (since R-squared = 1 - RSS/TSS).
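To make (a) concrete, here is a minimal sketch (the toy values of y_true and y_pred are my own, chosen purely for illustration) that computes R-squared directly from its definition:

# R-squared from its definition: R^2 = 1 - RSS/TSS.
# Predictions worse than the mean make RSS > TSS, hence R^2 < 0.
y_true = [1, 2, 3]
y_pred = [10, 10, 10]  # deliberately terrible predictions
mean_y = sum(y_true) / len(y_true)
tss = sum((y - mean_y) ** 2 for y in y_true)                 # TSS = 2
rss = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))  # RSS = 194
print(1 - rss / tss)  # -96.0; this decreases without bound as predictions worsen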
My reason for (b) is that when I used a PLS regression model on my data I got an R-squared value of about 0.94 (I can't remember the exact figure), but plugging it into the adjusted R-squared equation gave me a value above 1. I attributed this to the fact that there were fewer observations than input variables (140-odd observations and 228 input variables). The adjusted R-squared formula is given below.
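Adjusted R^2 = 1 - ((1 - R^2) * (N - 1)) / (N - p - 1)

where N is the number of observations and p is the number of predictors.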
After trying a few combinations of p and N, I observed that if N > p the adjusted R-squared does not go beyond 1. However, in my case N < p, and the adjusted R-squared does exceed 1.
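Plugging in my rough figures from above (R^2 ≈ 0.94, N = 140, p = 228) shows why: adjusted R^2 = 1 - ((1 - 0.94) * (140 - 1)) / (140 - 228 - 1) = 1 - (0.06 * 139) / (-89) ≈ 1 + 0.094 ≈ 1.09, which exceeds 1 precisely because the denominator N - p - 1 is negative.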
TL;DR So I would like to know: are my two conclusions (a) and (b) above correct, i.e. is R-squared bounded between negative infinity and 1, and can the adjusted R-squared exceed 1 when N < p?
Upvotes: 0
Views: 311
Reputation: 11
I had a similar question regarding the range of the adjusted r2_score. I implemented it and came to the conclusion that values approaching both negative and positive infinity are possible.
def adj_r2(r2, n, k):
    # Adjusted R-squared: r2 = R-squared, n = observations, k = predictors
    return 1 - ((1 - r2) * (n - 1)) / (n - k - 1)

# Case 1: a very negative R-squared stays very negative after adjustment,
# so there is no lower bound.
adj_r2(r2=-4000, n=100, k=10)  # returns -4449.550561797752

# Case 2: when n < k + 1 the denominator (n - k - 1) is negative, which flips
# the sign of the correction term, so the adjusted value can exceed 1.
adj_r2(r2=-4000, n=5, k=10)  # returns 2668.3333333333335
adj_r2(r2=0.8, n=5, k=10)    # returns 1.1333333333333333
Upvotes: 0