Reputation: 1572
I have been able to create a RandomForestClassifier on a dataset.
clf = RandomForestClassifier(n_estimators=100, random_state = 101)
I can then use it on the test data like this:
prediction = pd.DataFrame(clf.predict(x)) # x = Matrix of predictor values
So my question is: how can I test clf.predict outside of Python? How can I see the values it is using, and how can I test it "manually"? For example, if you get the betas from a regression, you can use those values in Excel and replicate the model. How can I do this with random forests in Python?
Also, is there a metric similar to R-squared for testing the model's explanatory power?
Thanks!
Upvotes: 0
Views: 108
Reputation: 7410
The RandomForestClassifier is an ensemble of trees, which means it is composed of multiple decision trees.
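To make the "ensemble" part concrete, here is a minimal sketch of how a prediction could be rebuilt from the individual trees. It assumes the iris dataset as a stand-in for your data (that choice and the variable names are mine, not from the question), and it relies on scikit-learn's forest averaging the class probabilities of its trees.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

# Assumption: iris stands in for the asker's dataset.
X, y = load_iris(return_X_y=True)
clf = RandomForestClassifier(n_estimators=100, random_state=101).fit(X, y)

# Collect the class probabilities predicted by every individual tree.
tree_probas = np.stack([tree.predict_proba(X) for tree in clf.estimators_])

# Average across trees, then pick the most probable class per row.
mean_proba = tree_probas.mean(axis=0)
manual_prediction = clf.classes_[np.argmax(mean_proba, axis=1)]

# Fraction of rows where the manual result agrees with clf.predict.
agreement = np.mean(manual_prediction == clf.predict(X))
print(agreement)
```

So replicating the forest elsewhere comes down to replicating each tree and averaging their outputs.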
To test the trees, I would suggest doing it in Python itself. You can access all the trees through the estimators_ attribute of the fitted classifier and then export them as graphs with export_graphviz from the sklearn.tree module.
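As a rough sketch of that step, the snippet below exports the first tree of a forest as Graphviz DOT source. The iris dataset and the output filename are assumptions for illustration only; rendering the file to an image requires the separate Graphviz tool.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import export_graphviz

# Assumption: iris stands in for the asker's dataset.
iris = load_iris()
clf = RandomForestClassifier(n_estimators=100, random_state=101)
clf.fit(iris.data, iris.target)

# estimators_ is a list of fitted DecisionTreeClassifier objects.
first_tree = clf.estimators_[0]

# With out_file=None the DOT source is returned as a string.
dot_source = export_graphviz(
    first_tree,
    out_file=None,
    feature_names=list(iris.feature_names),
    filled=True,
)

# Hypothetical filename; render with e.g. `dot -Tpng tree_0.dot -o tree_0.png`.
with open("tree_0.dot", "w") as f:
    f.write(dot_source)
```

Looping over clf.estimators_ would give one file per tree.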
If you insist on exporting the trees, you will need to export all the rules that each tree is composed of. For that, you can follow these instructions from the sklearn docs.
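One possible way to get at the rules, not necessarily the one in the linked instructions, is export_text from sklearn.tree, which prints a tree as nested if/else conditions. The raw split data is also available on each tree's tree_ attribute. The dataset below is again an assumed stand-in.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import export_text

# Assumption: iris stands in for the asker's dataset.
iris = load_iris()
clf = RandomForestClassifier(n_estimators=100, random_state=101)
clf.fit(iris.data, iris.target)

first_tree = clf.estimators_[0]

# Human-readable rules for one tree.
rules = export_text(first_tree, feature_names=list(iris.feature_names))
print(rules)

# Raw arrays behind the rules: one entry per node.
# Leaf nodes are marked with a negative feature index.
split_features = first_tree.tree_.feature
split_thresholds = first_tree.tree_.threshold
```

With 100 trees this is a lot of rules, which is why replicating a forest in Excel is far less practical than replicating a regression.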
Regarding the metrics, for a classification problem you could use accuracy_score from the sklearn.metrics module.
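A minimal sketch of computing it on held-out data, assuming the iris dataset and a 70/30 split (both my choices, not from the question):

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Assumption: iris stands in for the asker's dataset.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=101
)

clf = RandomForestClassifier(n_estimators=100, random_state=101)
clf.fit(X_train, y_train)

# Fraction of test rows whose predicted class matches the true class.
accuracy = accuracy_score(y_test, clf.predict(X_test))
print(accuracy)

# clf.score reports the same mean accuracy for classifiers.
score = clf.score(X_test, y_test)
```

Accuracy is the share of correct predictions, so it is not a direct analogue of R-squared, but it plays a similar role as a quick summary of fit for classification.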
Upvotes: 2