SaiLiu

Reputation: 49

Does adding a feature always make the model better?

I have trained a GBDT model to predict CTR, originally using 40 features. Then I added some more features, but the AUC is lower than the original model's.

  1. How could that happen?

  2. How do I determine which features are good for the model?

Upvotes: 3

Views: 2809

Answers (2)

miguelmalvarez

Reputation: 930

I agree that the most likely reason why adding more features produces worse results is overfitting, and that the main solution is feature selection.
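One common way to do this feature selection is to rank features by the GBDT's own importance scores and drop the weakest ones. A minimal sketch with scikit-learn, using synthetic data as a stand-in for your CTR features (the dataset, thresholds, and feature counts here are illustrative assumptions, not your actual setup):

```python
# Hypothetical sketch: rank features by GBDT importance and drop the weakest.
# Synthetic data stands in for real CTR features.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_selection import SelectFromModel

X, y = make_classification(n_samples=1000, n_features=45, n_informative=10,
                           random_state=0)
model = GradientBoostingClassifier(random_state=0).fit(X, y)

# Keep only features whose importance is at least the median importance.
selector = SelectFromModel(model, threshold="median", prefit=True)
X_reduced = selector.transform(X)
print(X_reduced.shape)  # fewer columns than the original 45
```

You would then retrain on `X_reduced` and compare validation AUC against the full feature set.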

Now, there are different techniques to verify and measure this intuition. One of the best tools is to plot learning curves for the model over training and validation subsets.

A good example of this can be seen in the tutorials for the sklearn library (Python). I also strongly recommend having a look at the lecture on learning curves from Andrew Ng's Machine Learning course on Coursera.
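The learning-curve idea above can be sketched with scikit-learn's `learning_curve` helper; the synthetic data and model settings below are assumptions for illustration, not a prescription for your CTR pipeline:

```python
# Hypothetical sketch: learning curves for a GBDT classifier on synthetic data.
# A large, persistent gap between train and validation AUC suggests overfitting.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import learning_curve

X, y = make_classification(n_samples=1000, n_features=40, random_state=0)

sizes, train_scores, val_scores = learning_curve(
    GradientBoostingClassifier(random_state=0),
    X, y, cv=3, scoring="roc_auc",
    train_sizes=np.linspace(0.1, 1.0, 5),
)

for n, tr, va in zip(sizes, train_scores.mean(axis=1), val_scores.mean(axis=1)):
    print(f"n={n}: train AUC={tr:.3f}, val AUC={va:.3f}")
```

If the validation curve is still rising as training size grows, more data (or fewer features) is likely to help.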

Upvotes: 0

sray

Reputation: 584

If adding more features deteriorates performance, this is likely because of overfitting. Your model's learning parameters need to be tuned to avoid overly complex (overfitted) models.

In the case of random forests, tree depth is one such parameter. Trees should not be allowed to grow too deep, or they can overfit (this can happen in random forests even though there are a lot of trees).
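The same applies to GBDT: a cross-validated search over `max_depth` is a standard way to pick a depth that avoids overfitting. A minimal sketch, again assuming synthetic data in place of your real training set:

```python
# Hypothetical sketch: cross-validated search over tree depth for a GBDT,
# to curb overfitting. Synthetic data stands in for the real training set.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

search = GridSearchCV(
    GradientBoostingClassifier(n_estimators=100, random_state=0),
    param_grid={"max_depth": [2, 3, 5, 8]},
    scoring="roc_auc", cv=3,
).fit(X, y)

print(search.best_params_)  # shallower trees often win when data is limited
```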

Upvotes: 2

Related Questions