How to make predictions using a K-Nearest Neighbors (KNN) model when the data has been normalized (Python)

Question

I have built a KNN model in Python (using scikit-learn) with three predictor variables (Age, Distance, Travel Allowance), with the aim of predicting an outcome for the target variable (Method of Travel).

When constructing the model, I had to normalize the data for the three predictor variables (Age, Distance, Travel Allowance). This increased the accuracy of my model compared to not normalizing the data.

Now that I have constructed the model, I want to make a prediction. But how should I enter the predictor variables, given that the model was trained on normalized data?

I want to call KNN.predict([[30,2000,40]]) to make a prediction where Age = 30, Distance = 2000, and Allowance = 40. But because the data has been normalized, I can't see how to do this. I used the following code to normalize the data:
X = preprocessing.StandardScaler().fit(X).transform(X.astype(float))

Michael C · Accepted Answer

Actually, the answer is buried in the code you provided!

Once you fit an instance of preprocessing.StandardScaler(), it remembers how to scale data: it stores the per-column mean and standard deviation of the data it was fitted on. Try this:

from sklearn import preprocessing

# Fit the scaler once and keep a reference to it.
# scaler is an object that knows how to normalize data points.
scaler = preprocessing.StandardScaler().fit(X)
# Use the scaler to normalize the data points in X.
X_normalized = scaler.transform(X.astype(float))
# Note: this is what you have already done, just split into two steps
# so that the scaler object is captured.
#
# ... Train your model on X_normalized
#
# Now predict: normalize the new point with the same scaler first.
other_data = [[30, 2000, 40]]
other_data_normalized = scaler.transform(other_data)
KNN.predict(other_data_normalized)

Notice that I used scaler.transform twice in the same way: once on the training data and once on the new data point.
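
For anyone who wants to run this end to end, here is a minimal sketch of the same idea. The training rows and the "bus"/"train" labels are made up purely for illustration (they are not the asker's data), and n_neighbors=3 is an arbitrary choice. It also checks that scaler.transform is just (x - mean_) / scale_ computed with the statistics learned from the training data.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

# Hypothetical training data: columns are Age, Distance, Travel Allowance.
X = np.array([
    [25, 500, 10],
    [32, 1800, 35],
    [47, 2500, 50],
    [51, 300, 5],
    [38, 2200, 45],
    [29, 700, 15],
], dtype=float)
# Hypothetical labels for Method of Travel.
y = np.array(["bus", "train", "train", "bus", "train", "bus"])

# Fit the scaler on the training data only, and keep the fitted object.
scaler = StandardScaler().fit(X)
X_normalized = scaler.transform(X)

# Train KNN on the normalized features.
knn = KNeighborsClassifier(n_neighbors=3).fit(X_normalized, y)

# Normalize the new point with the SAME fitted scaler before predicting.
new_point = np.array([[30, 2000, 40]], dtype=float)
new_point_normalized = scaler.transform(new_point)

# The same normalization done by hand from the stored training statistics.
manual = (new_point - scaler.mean_) / scaler.scale_

prediction = knn.predict(new_point_normalized)
print(prediction)
```

The important part is that the scaler is never re-fitted on the new point; it only reuses the mean and standard deviation learned from the training data.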

See StandardScaler.transform
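
As a side note, and as an alternative that is not part of the answer above, scikit-learn can also bundle the scaler and the classifier into one object with make_pipeline, so that predict applies the scaling for you. A rough sketch, again using made-up data and an arbitrary n_neighbors=3:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical training data: columns are Age, Distance, Travel Allowance.
X = np.array([
    [25, 500, 10],
    [32, 1800, 35],
    [47, 2500, 50],
    [51, 300, 5],
    [38, 2200, 45],
    [29, 700, 15],
], dtype=float)
# Hypothetical labels for Method of Travel.
y = np.array(["bus", "train", "train", "bus", "train", "bus"])

# The pipeline fits the scaler and then the classifier on the scaled data.
model = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=3))
model.fit(X, y)

# Raw (un-normalized) values go in; the pipeline scales them internally.
pipeline_prediction = model.predict([[30, 2000, 40]])
print(pipeline_prediction)
```

With this setup you can pass raw values such as [[30, 2000, 40]] straight to predict, which makes it harder to forget the normalization step.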
