Normalization in multiple-linear regression

Question

I have a data set for which I would like to build a multiple linear regression model. To compare the different independent variables, I normalize them by their standard deviation. I used sklearn.linear_model for this. I thought that this normalization would not affect the coefficient of determination, i.e., the R2 value of the prediction; only the parameters of the estimator would differ. I get this expected result with LinearRegression, but the results are different when I use ElasticNet.

I am wondering whether my assumption that the R2 value is unchanged by normalization is valid. If it is not, is there another way to achieve what I want, namely to compare the relative importance of the variables?

import numpy as np
from sklearn.linear_model import ElasticNet, LinearRegression
from sklearn import datasets

# Load the data
diabetes = datasets.load_diabetes()
X = diabetes.data
y = diabetes.target
# Scale each feature by its standard deviation (no centering)
X1 = X / X.std(0)

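# Ordinary least squares on the raw features
regrLinear = LinearRegression(normalize=False)
regrLinear.fit(X, y)

regrLinear.score(X, y)
0.51774942541329372

# Ordinary least squares on the scaled features: same R2
regrLinear.fit(X1, y)
regrLinear.score(X1, y)
0.51774942541329372

# Ordinary least squares with the built-in normalize option: same R2
regrLinear = LinearRegression(normalize=True)
regrLinear.fit(X, y)
regrLinear.score(X, y)
0.51774942541329372

# ElasticNet (default alpha and l1_ratio) on the raw features
regrEN = ElasticNet(normalize=False)
regrEN.fit(X, y)
regrEN.score(X, y)
0.00883477003833

# ElasticNet on the scaled features: R2 changes
regrEN.fit(X1, y)
regrEN.score(X1, y)
0.48426155538537963

# ElasticNet with the built-in normalize option
regrEN = ElasticNet(normalize=True)
regrEN.fit(X, y)
regrEN.score(X, y)
0.008834770038326667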
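
The comparison above can also be reproduced without the normalize argument, which newer scikit-learn releases no longer accept. The following is only a minimal sketch, assuming the default ElasticNet settings (alpha=1.0, l1_ratio=0.5) and using StandardScaler inside a pipeline as a stand-in for the manual division by the standard deviation. The names r2_ols_raw, r2_en_pipe, etc. are introduced here for illustration and are not part of the original post.

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import ElasticNet, LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_diabetes(return_X_y=True)
X_scaled = X / X.std(axis=0)  # same scaling as X1 in the question

# Unpenalized least squares: rescaling a column only rescales its coefficient,
# so the fitted values (and therefore R2) should not change.
r2_ols_raw = LinearRegression().fit(X, y).score(X, y)
r2_ols_scaled = LinearRegression().fit(X_scaled, y).score(X_scaled, y)

# ElasticNet penalizes the size of the coefficients, so the same alpha
# shrinks the model differently when the feature scale changes.
r2_en_raw = ElasticNet().fit(X, y).score(X, y)
r2_en_scaled = ElasticNet().fit(X_scaled, y).score(X_scaled, y)

# Assumption: a pipeline with StandardScaler is one way to keep the scaling
# tied to the model, so all features are penalized on a common scale.
model = make_pipeline(StandardScaler(), ElasticNet())
model.fit(X, y)
r2_en_pipe = model.score(X, y)
coefs = model.named_steps["elasticnet"].coef_  # coefficients per unit-variance feature

print("OLS R2 raw / scaled:       ", r2_ols_raw, r2_ols_scaled)
print("ElasticNet R2 raw / scaled:", r2_en_raw, r2_en_scaled)
print("ElasticNet R2 via pipeline:", r2_en_pipe)
print("Pipeline coefficients:     ", coefs)
```

Under these assumptions, the coefficients taken from the pipeline refer to features with unit variance, which is what makes their magnitudes roughly comparable. The strength of the shrinkage still depends on alpha, so the ranking of variables is only as meaningful as that choice (for example, one selected by cross-validation).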
