Reputation: 7840
I have a data set for which I would like to build a multiple linear regression model. In order to compare the different independent variables, I normalize them by their standard deviation. I used sklearn.linear_model for this. I thought that this normalization would not affect the coefficient of determination, i.e., the R² value of the prediction; only the parameters of the estimator would be different. I got this expected result when using LinearRegression; however, the results are different when I use ElasticNet.
I am wondering whether my assumption that the R² value is unchanged by normalization is valid. If it is not, is there another way to achieve what I want, namely being able to compare the relative importance of the variables?
import numpy as np
from sklearn.linear_model import ElasticNet, LinearRegression
from sklearn import datasets
# Load the data
diabetes = datasets.load_diabetes()
X = diabetes.data
y = diabetes.target
# Scale each feature by its standard deviation (no centering)
X1 = X/X.std(0)
regrLinear = LinearRegression(normalize=False)
regrLinear.fit(X,y)
regrLinear.score(X,y)
0.51774942541329372
regrLinear.fit(X1,y)
regrLinear.score(X1,y)
0.51774942541329372
regrLinear = LinearRegression(normalize=True)
regrLinear.fit(X,y)
regrLinear.score(X,y)
0.51774942541329372
regrEN=ElasticNet(normalize=False)
regrEN.fit(X,y)
regrEN.score(X,y)
0.00883477003833
regrEN.fit(X1,y)
regrEN.score(X1,y)
0.48426155538537963
regrEN=ElasticNet(normalize=True)
regrEN.fit(X,y)
regrEN.score(X,y)
0.008834770038326667
Upvotes: 3
Views: 5115
Reputation: 234
regrEN = ElasticNet(normalize=True)
regrEN.fit(X,y)
print(regrEN.score(X,y))
0.00883477003833
regrEN.fit(X1,y)
print(regrEN.score(X1,y))
0.00883477003833
I get the same score for both. I wonder how your script is running with regr.score; maybe it is printing something else from code that you didn't include in your example?
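As for the normalize=False runs in the question, one likely explanation (a sketch, not a definitive diagnosis) is that ordinary least squares is invariant to rescaling the columns, while the ElasticNet penalty acts on the coefficient magnitudes and therefore depends on the feature scale. The diabetes features shipped with scikit-learn are already scaled to very small values, so fitting y needs large coefficients, and the default penalty shrinks them almost to zero. The snippet below assumes a recent scikit-learn, where the normalize argument no longer exists, so it is simply omitted; the helper r2 is a hypothetical name introduced here for illustration.

```python
from sklearn import datasets
from sklearn.linear_model import ElasticNet, LinearRegression

# Same data and scaling as in the question
X, y = datasets.load_diabetes(return_X_y=True)
X1 = X / X.std(0)  # every column now has unit standard deviation


def r2(model, features, target):
    """Fit the model and return its in-sample R^2 (hypothetical helper)."""
    return model.fit(features, target).score(features, target)


# Unpenalized least squares: rescaling columns only rescales the coefficients
ols_raw = r2(LinearRegression(), X, y)
ols_scaled = r2(LinearRegression(), X1, y)

# ElasticNet with default alpha and l1_ratio: the penalty sees the feature scale
en_raw = r2(ElasticNet(), X, y)
en_scaled = r2(ElasticNet(), X1, y)

print("LinearRegression:", ols_raw, ols_scaled)
print("ElasticNet:      ", en_raw, en_scaled)
```

If the goal is to compare the relative importance of the variables, fitting the penalized model on X1 (or on features standardized some other way) and comparing the resulting coefficients is one reasonable option, keeping in mind that the R² of a penalized model will generally change with the scaling and with alpha.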
Upvotes: 1