Kasia

Reputation: 51

Advice re: retaining staff ID when training machine learning model

I am looking to develop a machine learning model that predicts staff performance (e.g. staff ID 12345 will sell 15 insurance products next month). I don't want to include staff ID in the training dataset because it will skew the results. However, I do need to be able to associate each staff member with their predicted performance once the model is functional.

Is the only way to go about this to develop the model without staff details, pass in a dataframe without staff ID at prediction time, and then associate the model output with the staff details by index / row order?

It just seems like a roundabout way of doing this.

Upvotes: 0

Views: 495

Answers (1)

Jujie YANG

Reputation: 36

I think so. That is the only way I can think of too, because you should not include the staff ID as a feature in your training data.

Since you are using pandas, you can easily look up any staff member through the DataFrame. Don't worry: I think this is a straightforward and fast way to map your predictions back to the staff IDs.
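To make the mapping concrete, here is a minimal sketch. The column names, the numbers, and the least-squares stand-in model are all made up for illustration and are not from your data; any estimator that returns one prediction per input row should behave the same way. The idea is to keep the staff ID as the DataFrame index, so it never becomes a feature but still travels with each row.

```python
import numpy as np
import pandas as pd

# Hypothetical data: column names and values are assumptions for illustration.
staff = pd.DataFrame(
    {
        "staff_id": [12345, 12346, 12347, 12348],
        "tenure_years": [1.0, 3.0, 5.0, 2.0],
        "calls_last_month": [40, 55, 70, 45],
        "sales_next_month": [8, 12, 17, 10],
    }
).set_index("staff_id")  # the ID becomes the index, not a feature column

feature_cols = ["tenure_years", "calls_last_month"]
X = staff[feature_cols]
y = staff["sales_next_month"]

# Stand-in model (an assumption): ordinary least squares with an intercept,
# fitted with NumPy. Swap in whatever estimator you actually use.
A = np.column_stack([np.ones(len(X)), X.to_numpy(dtype=float)])
coef, *_ = np.linalg.lstsq(A, y.to_numpy(dtype=float), rcond=None)

# Predict for the same rows, only to keep the sketch short.
preds = A @ coef

# Reattach predictions by reusing the feature frame's index (the staff IDs).
predictions = pd.Series(preds, index=X.index, name="predicted_sales")
result = predictions.reset_index()  # columns: staff_id, predicted_sales
```

After this, `predictions.loc[12345]` gives the prediction for that staff member, and `result` is an ordinary table you can merge with other staff details. Because the index is copied from the same frame the features came from, the row order cannot drift out of sync.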

Sorry for not offering a new or better way. But I don't think you need to worry too much about the existing approach, because I can't think of any downsides, such as a runtime cost. Hope this helps.

Upvotes: 1
