Predicting price using regression data model

Question

I built a regression model to predict house price from several independent variables, and I obtained a regression equation with coefficients. I used StandardScaler() to scale my variables before splitting the data set. Now I want to predict the house price for new values of the independent variables using my regression model. Can I plug the new values directly into the equation and calculate the price, or should I first pass them through StandardScaler()?

Tushar Gupta · Accepted Answer

To answer your question: yes, you have to preprocess your test input as well, but consider the following explanation.

StandardScaler() standardizes features by removing the mean and scaling to unit variance.
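As a small illustration of that arithmetic, here is a plain-Python sketch for a single feature column. The values in `areas` are made up for the example; the point is only that each value becomes (x - mean) / std, using the population standard deviation, which is what StandardScaler computes by default.

```python
from statistics import fmean, pstdev

# Hypothetical values for one feature column (e.g. house area).
areas = [50.0, 70.0, 90.0, 110.0]

mean = fmean(areas)   # mean of the column
std = pstdev(areas)   # population standard deviation (divides by n, not n - 1)

# Standardize: subtract the mean, then divide by the standard deviation.
scaled = [(x - mean) / std for x in areas]

# After scaling, the column has mean 0 and standard deviation 1.
print(scaled)
```

The mean and standard deviation are the "parameters" the scaler learns when it is fitted; any later input has to be transformed with those same two numbers.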

If you fit the scaler on the whole dataset and then split, the scaler uses all values, including those that end up in the test set, when computing the mean and variance.

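The test set should ideally not be preprocessed together with the training data, so that there is no 'peeking ahead'. Fit the preprocessing on the training data only, and once the model is built, apply the same preprocessing parameters (the mean and variance learned from the training set) to the test set, as though the test set had not existed before. New values you want a prediction for are handled the same way: transform them with the already fitted scaler, then pass them to the model.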
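A minimal sketch of that workflow with scikit-learn is below. The data is synthetic and the two columns (area, number of rooms), the coefficients used to generate `y`, and the `new_house` values are all assumptions made for illustration, not part of the original question.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic data for illustration only: columns are area and number of rooms.
rng = np.random.default_rng(0)
X = np.column_stack([rng.uniform(40, 200, 100), rng.integers(1, 6, 100)])
y = 1000 * X[:, 0] + 5000 * X[:, 1] + rng.normal(0, 2000, 100)

# Split first, so the scaler never sees the test rows.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)  # learn mean/std from train only
X_test_scaled = scaler.transform(X_test)        # reuse the training mean/std

model = LinearRegression().fit(X_train_scaled, y_train)

# A new, unseen house: scale it with the same fitted scaler, then predict.
new_house = np.array([[120.0, 3.0]])
predicted_price = model.predict(scaler.transform(new_house))
print(predicted_price)
```

Note that `fit_transform` is called only on the training data, while the test set and the new input go through `transform`. If you prefer not to manage the scaler by hand, wrapping the scaler and the model with `sklearn.pipeline.make_pipeline` is one way to have the same scaling applied automatically at prediction time.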
