Stupid420

Reputation: 1419

How to handle feature scaling at machine learning model deployment when you have only one test instance?

I am developing a neural network model for a classification problem. There are about 1500 features, and they have very different ranges. I trained the model with feature normalization and got better results. Now that I am going to deploy the model, a user will test it with only one example at a time. My in-house train/test sets were normalized, but the user's input is a single instance, and it seemingly cannot be normalized because it is not a set of examples. How should my model handle this situation?

Upvotes: 3

Views: 720

Answers (2)

I agree with janu777. It is also worth overriding predict, or writing a wrapper on top of your model, so that the user does not have to worry about this problem. What I would do is expose an interface method that the user can invoke to get predictions from the trained model. Inside that method, I would transform the incoming data the same way the training data was transformed, then predict. In Python we can use StandardScaler from sklearn's preprocessing module. A sketch is below.

from sklearn.preprocessing import StandardScaler

def predict(test_instance):
    # Fit the scaler on the same training data the model was trained on,
    # then apply that exact transformation to the incoming instance.
    scaler = StandardScaler()
    scaler.fit(training_data)
    # reshape(1, -1): sklearn expects a 2-D array, even for a single sample
    scaled = scaler.transform(test_instance.reshape(1, -1))
    return model.predict(scaled)

This way you don't have to keep track of the normalization statistics yourself.
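In practice you would fit the scaler once at training time and persist it alongside the model, rather than refitting it on every call. A minimal sketch of that pattern with joblib (the file name, array shapes, and data here are illustrative, not from the question):

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.preprocessing import StandardScaler

# Illustrative training data: 100 samples, 1500 features with varied ranges
rng = np.random.default_rng(0)
training_data = rng.normal(loc=50.0, scale=10.0, size=(100, 1500))

# Fit the scaler once, at training time
scaler = StandardScaler()
scaler.fit(training_data)

# Persist it next to the model so deployment reuses the same statistics
path = os.path.join(tempfile.gettempdir(), "scaler.pkl")
joblib.dump(scaler, path)

# At deploy time: load the fitted scaler and transform a single instance
loaded = joblib.load(path)
single_instance = rng.normal(loc=50.0, scale=10.0, size=1500)
scaled = loaded.transform(single_instance.reshape(1, -1))
```

This avoids shipping the full training set with the deployed model; only the fitted scaler (its per-feature mean and scale) travels with it.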

Upvotes: 2

janu777

Reputation: 1978

Always normalize the test samples with the same values with which your training data was normalized.

So you won't have a problem.

You should not normalize the test data separately because the model will perform differently.

Example: calculate the mean and standard deviation of your training set. You use those values to normalize the training set. Now use the same mean and standard deviation on your test samples as well.

This should solve the issue.
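The recipe above can be sketched with NumPy (the array names and toy data are illustrative):

```python
import numpy as np

# Toy training set: 200 samples, 5 features on a large range
rng = np.random.default_rng(42)
X_train = rng.uniform(0.0, 100.0, size=(200, 5))

# Statistics computed once, from the training set only
mu = X_train.mean(axis=0)
sigma = X_train.std(axis=0)

# Normalize the training data
X_train_norm = (X_train - mu) / sigma

# A single test instance is normalized with the SAME mu and sigma --
# no statistics are computed from the test instance itself
x_test = rng.uniform(0.0, 100.0, size=5)
x_test_norm = (x_test - mu) / sigma
```

Because mu and sigma come from training data, normalizing one instance is just an element-wise arithmetic operation; no batch of test examples is needed.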

Upvotes: 3

Related Questions