EGM8686

Reputation: 1572

Python RandomForest classifier (how to test it)

I have been able to create a RandomForestClassifier on a dataset.

from sklearn.ensemble import RandomForestClassifier
clf = RandomForestClassifier(n_estimators=100, random_state=101)

I can then use it on the test data like this:

import pandas as pd
prediction = pd.DataFrame(clf.predict(x))  # x = matrix of predictor values

So my question is: how can I test clf.predict outside of Python? How can I see the values it is using, and how can I test it "manually"? For example, if you get the betas from a regression, you can use those values in Excel and replicate the model. How can I do this with a random forest in Python?

Also, is there a metric similar to R-squared for testing the model's explanatory power?

Thanks!

Upvotes: 0

Views: 108

Answers (1)

Franco Piccolo

Reputation: 7410

The RandomForestClassifier is an ensemble, which means it is composed of multiple decision trees.

To test the trees, I would suggest doing it in Python itself: you can access all the fitted trees through the estimators_ attribute of the classifier and then export them as graphs with export_graphviz from the sklearn.tree module.
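As a minimal sketch of that idea: the iris dataset below is only a stand-in for your own data, and the names first_tree and dot_source are purely illustrative. Passing out_file=None makes export_graphviz return the DOT source as a string, so Graphviz itself is only needed if you want to render it to an image.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import export_graphviz

# Assumption: iris stands in for the asker's dataset.
iris = load_iris()
clf = RandomForestClassifier(n_estimators=100, random_state=101)
clf.fit(iris.data, iris.target)

# estimators_ is a list of fitted DecisionTreeClassifier objects,
# one per tree in the forest.
print(len(clf.estimators_))

# Export the first tree as Graphviz DOT source (returned as a string).
first_tree = clf.estimators_[0]
dot_source = export_graphviz(
    first_tree,
    out_file=None,
    feature_names=iris.feature_names,
    class_names=list(iris.target_names),
    filled=True,
)
print(dot_source[:300])
```

Looping over clf.estimators_ in the same way would give you one DOT string per tree.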

If you insist on exporting the trees, you will need to export all the rules that each tree is composed of. For that, you can follow these instructions from the sklearn docs.
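One way to get the rules as plain text is export_text from sklearn.tree; this is my own choice for the sketch and not necessarily what the linked instructions describe. The snippet also reproduces the forest's prediction "by hand" by averaging the per-tree class probabilities and taking the most probable class, which is the calculation you would have to replicate outside Python. Again, iris is only a stand-in dataset and the variable names are illustrative.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import export_text

# Assumption: iris stands in for the asker's dataset.
iris = load_iris()
X, y = iris.data, iris.target
clf = RandomForestClassifier(n_estimators=100, random_state=101).fit(X, y)

# Plain-text if/else rules of the first tree; repeat for every tree
# in clf.estimators_ to export the whole forest.
rules = export_text(clf.estimators_[0], feature_names=iris.feature_names)
print(rules)

# Manual replication of clf.predict: average the class probabilities
# predicted by each tree, then pick the class with the highest average.
tree_probas = np.array([tree.predict_proba(X) for tree in clf.estimators_])
mean_proba = tree_probas.mean(axis=0)
manual_prediction = clf.classes_[np.argmax(mean_proba, axis=1)]

print(np.array_equal(manual_prediction, clf.predict(X)))
```

With 100 trees this is a lot of rules to carry into something like Excel, which is why testing inside Python is usually the more practical route.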

Regarding metrics, for a classification problem you could use accuracy_score from the sklearn.metrics module, which gives the fraction of correctly classified samples.
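A short sketch of how that could look, assuming a held-out test split; the iris data, the 30% test size, and the variable names are illustrative choices, not part of the question.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Assumption: iris stands in for the asker's dataset.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=101
)

clf = RandomForestClassifier(n_estimators=100, random_state=101)
clf.fit(X_train, y_train)

# Accuracy on data the model has not seen during fitting.
y_pred = clf.predict(X_test)
acc = accuracy_score(y_test, y_pred)
print(acc)

# For classifiers, clf.score reports the same mean accuracy.
print(clf.score(X_test, y_test))
```

Measuring accuracy on held-out data rather than on the training data matters here, because a random forest with fully grown trees will usually score close to perfectly on the data it was fitted on.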

Upvotes: 2
