Reputation: 7400
I have a data set comprising a vector of features and a target that is either 1.0 or 0.0 (representing two classes). If I fit a RandomForestRegressor and call its predict function, is it equivalent to using RandomForestClassifier.predict_proba()?
In other words, if the target is 1.0 or 0.0, does RandomForestRegressor output probabilities?
I think so, and the results I am getting suggest so, but I would like to get a second opinion...
Thanks Weasel
Upvotes: 4
Views: 5541
Reputation: 48357
There is a major conceptual difference between the two, because they address different tasks:
Regression: continuous (real-valued) target variable.
Classification: discrete target variable (classes).
For a general classification method, the notion of "the probability of an observation belonging to class X" may not be defined, since some classification methods, kNN for example, do not deal with probabilities.
However, for Random Forest (and some other classification methods), classification is reduced to regression on the distribution of class probabilities. The predicted class is then taken as the argmax of the computed "probabilities". In your case, you feed in the same input and you get essentially the same result (not necessarily identical numbers, since the default hyperparameters of the two estimators can differ). And yes, it is OK to treat the values returned by RandomForestRegressor as probabilities.
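If you want to check this on your own data, here is a minimal sketch of the comparison, assuming scikit-learn and NumPy are installed. The synthetic data set, the variable names, and the choice of max_features="sqrt" for both models are my assumptions, not something from your setup. I set max_features explicitly so that both forests are configured the same way.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
from sklearn.model_selection import train_test_split

# Synthetic binary problem (an assumption for illustration);
# labels are cast to 0.0 / 1.0 as in the question.
X, y = make_classification(
    n_samples=500, n_features=10, n_informative=5, random_state=0
)
y = y.astype(float)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Same settings for both models, so the comparison is like for like.
params = dict(n_estimators=100, max_features="sqrt", random_state=0)
clf = RandomForestClassifier(**params).fit(X_train, y_train)
reg = RandomForestRegressor(**params).fit(X_train, y_train)

# Column of predict_proba that corresponds to class 1.0.
class_one = list(clf.classes_).index(1.0)
proba_clf = clf.predict_proba(X_test)[:, class_one]
proba_reg = reg.predict(X_test)

# How far apart are the two sets of values, and how often does
# thresholding the regressor at 0.5 match the classifier's label?
max_gap = np.abs(proba_clf - proba_reg).max()
corr = np.corrcoef(proba_clf, proba_reg)[0, 1]
agreement = np.mean((proba_reg > 0.5) == (clf.predict(X_test) == 1.0))

print(f"largest absolute gap: {max_gap:.3f}")
print(f"correlation:          {corr:.3f}")
print(f"label agreement:      {agreement:.3f}")
```

The regressor's output stays in the [0, 1] range here because every tree predicts an average of 0.0/1.0 targets, which is why it can be read as an estimate of the probability of class 1.0. Values that land exactly on 0.5 may be labelled differently by the two approaches.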
Upvotes: 4