Amir

Reputation: 1087

Random Forest Feature Importance Robustness with Python

I am using Random Forest from sklearn for feature importance. However, the feature importances change when I change the random_state parameter of the RF. Is there any way to get robust feature importance with RF?

Upvotes: 0

Views: 102

Answers (1)

puhuk

Reputation: 484

This follows from the principle of the Random Forest algorithm. RF builds each tree with a greedy heuristic rather than searching for a global optimum. To offset that, it combines multiple trees, each trained on randomly sampled features and samples. random_state seeds the random numbers used for that sampling. The documentation linked below says:

If int, random_state is the seed used by the random number generator; If RandomState instance, random_state is the random number generator; If None, the random number generator is the RandomState instance used by np.random.

https://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeClassifier.html

So if you set random_state to a fixed value, you should get the same feature importances on every run. That does not make them robust, though: RF is not an algorithm that guarantees robustness; it gives an answer based on its heuristic search.
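To see both points in practice, here is a minimal sketch. The synthetic dataset, the `importances` helper, the seed values, and the tree count are all assumptions made up for illustration, not anything from the question. It checks that a fixed seed reproduces the importances, then repeats the fit over several seeds and reports the mean and spread per feature. Averaging over seeds is just one simple way to gauge how stable the ranking is; it does not turn RF into a method with robustness guarantees.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data, used only for illustration (assumption: not the asker's data).
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=3, random_state=0
)


def importances(seed):
    """Fit a forest with the given seed and return its feature importances."""
    rf = RandomForestClassifier(n_estimators=50, random_state=seed)
    rf.fit(X, y)
    return rf.feature_importances_


# Same seed twice: the importances should match.
same_a = importances(42)
same_b = importances(42)
reproducible = np.allclose(same_a, same_b)
print("same seed reproduces importances:", reproducible)

# Different seeds: collect one row of importances per seed.
seeds = range(10)
runs = np.array([importances(s) for s in seeds])
mean_importance = runs.mean(axis=0)
std_importance = runs.std(axis=0)

# Report features from most to least important on average.
for i in np.argsort(mean_importance)[::-1]:
    print(f"feature {i}: {mean_importance[i]:.3f} +/- {std_importance[i]:.3f}")
```

A large standard deviation relative to the mean suggests that the ranking of that feature depends heavily on the seed. Increasing n_estimators usually shrinks that spread, at the cost of training time.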

Upvotes: 0
