Girijesh Singh

Reputation: 37

Predicting new values with a model trained on one-hot encoded data

This might look like a trivial problem, but I am stuck on predicting results from a model. My problem is as follows:

I have a dataset of shape 1000 x 19 (excluding the target feature), but after one-hot encoding it becomes 1000 x 141. Since I trained the model on data of shape 1000 x 141, I need input of shape 1 x 141 (at least) for prediction. I also know that in Python I can make predictions using

model.predict(data)

However, the data I get from an end user through a web portal has shape 1 x 19, and I am confused about how to proceed to make predictions based on it.

How can I convert data of shape 1 x 19 into 1 x 141 while keeping the same column order as in the train/test data? Any help in this direction would be highly appreciated.

Upvotes: 2

Views: 3809

Answers (1)

secretive

Reputation: 2104

I am assuming that you are using sklearn's OneHotEncoder to create the one-hot encoding. If so, the problem is easy to solve. You fit the one-hot encoder on your training data:

from sklearn.preprocessing import OneHotEncoder
encoder = OneHotEncoder(categories = "auto", handle_unknown = "ignore")
X_train_encoded = encoder.fit_transform(X_train)

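In the code above, the encoder is fitted on your training data, so when you get the test data (or a new row from the user), you can transform it into the same encoding using this fitted encoder. Because handle_unknown = "ignore" is set, a category that was not seen during fitting is encoded as all zeros for that feature instead of raising an error.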

test_data = encoder.transform(test_data)
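To make the fit-once, transform-later idea concrete, here is a minimal sketch on toy data. The column names and values are hypothetical and stand in for the 19 raw features; only the OneHotEncoder calls mirror the snippet above.

```python
import pandas as pd
from sklearn.preprocessing import OneHotEncoder

# Hypothetical training data: 3 raw categorical columns instead of the real 19.
X_train = pd.DataFrame({
    "color": ["red", "green", "blue", "green"],
    "size": ["S", "M", "L", "M"],
    "city": ["Delhi", "Pune", "Delhi", "Agra"],
})

# Fit the encoder once, on the training data only.
encoder = OneHotEncoder(categories="auto", handle_unknown="ignore")
X_train_encoded = encoder.fit_transform(X_train)

# A single raw row, as it might arrive from the web portal.
# Selecting X_train.columns forces the same raw column order as in training.
user_row = pd.DataFrame([{"city": "Mumbai", "color": "blue", "size": "M"}])
user_row = user_row[X_train.columns]

# Reuse the fitted encoder; do not fit again on the new row.
user_encoded = encoder.transform(user_row)

print(X_train_encoded.shape)  # 3 + 3 + 3 categories -> 9 encoded columns
print(user_encoded.shape)     # same 9 columns, one row
```

The encoded column order comes from the fitted encoder, so the new row lines up with the training matrix as long as the raw columns are passed in the same order. "Mumbai" was not seen during fitting, so its block of columns is all zeros.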

Now your test data will also have 141 columns (shape 1 x 141 for a single row). You can check the shape using

test_data.shape  # the sparse matrix returned by transform exposes .shape directly
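Putting the pieces together for the web-portal case, the sketch below encodes one raw row and passes it to model.predict. The model choice (LogisticRegression), the toy columns, and the helper name predict_from_raw are assumptions for illustration only; the question does not say which model is used.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import OneHotEncoder

# Hypothetical raw training data and target.
X_train = pd.DataFrame({
    "color": ["red", "green", "blue", "green", "red", "blue"],
    "size": ["S", "M", "L", "M", "L", "S"],
})
y_train = [0, 1, 1, 1, 0, 1]

# Fit the encoder and the model once, at training time.
encoder = OneHotEncoder(handle_unknown="ignore")
model = LogisticRegression()  # assumed model; any estimator with predict() works the same way
model.fit(encoder.fit_transform(X_train), y_train)

def predict_from_raw(raw_row):
    """Encode one raw row (a dict of column -> value) and predict on it."""
    # Reorder the raw columns to match the training data before encoding.
    row = pd.DataFrame([raw_row])[list(X_train.columns)]
    return model.predict(encoder.transform(row))

# The dict keys may arrive in any order; the helper restores the training order.
prediction = predict_from_raw({"size": "M", "color": "green"})
print(prediction.shape)  # one prediction for the one input row
```

The fitted encoder and the model have to be the same objects (or saved copies of them) that were produced at training time; fitting a new encoder on the single user row would give a different column layout.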

Upvotes: 6
