Reputation: 1567
I have previously used OpenCV's Random Forest for a classification task, setting the parameters using the example here as a guide. It works perfectly. However, I now want to solve a regression problem.
I have a rough idea that this is controlled by the var_type Mat, which defines the type of each variable passed to the Random Forest train method, but I am not sure what each of these flags corresponds to.
For the classification task it looks like this (code copied from the link above):
// define all the attributes as numerical
// alternatives are CV_VAR_CATEGORICAL or CV_VAR_ORDERED(=CV_VAR_NUMERICAL)
// that can be assigned on a per attribute basis
Mat var_type = Mat(ATTRIBUTES_PER_SAMPLE + 1, 1, CV_8U );
var_type.setTo(Scalar(CV_VAR_NUMERICAL) ); // all inputs are numerical
// this is a classification problem (i.e. predict a discrete number of class
// outputs) so reset the last (+1) output var_type element to CV_VAR_CATEGORICAL
var_type.at<uchar>(ATTRIBUTES_PER_SAMPLE, 0) = CV_VAR_CATEGORICAL;
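The layout described in those comments can be sketched without OpenCV: one flag per input attribute, followed by one trailing flag for the response. This is only an illustration. `make_var_types`, `VAR_ORDERED` and `VAR_CATEGORICAL` are hypothetical stand-ins, and the numeric values are an assumption based on the OpenCV 2.x `CV_VAR_*` macros.

```cpp
#include <cstddef>
#include <vector>

// Stand-ins for OpenCV's CV_VAR_* macros so the sketch compiles without OpenCV.
// Assumption: these mirror the OpenCV 2.x values, where "ordered" and
// "numerical" are the same flag.
const unsigned char VAR_ORDERED = 0;      // CV_VAR_ORDERED / CV_VAR_NUMERICAL
const unsigned char VAR_CATEGORICAL = 1;  // CV_VAR_CATEGORICAL

// Hypothetical helper: builds one flag per input attribute plus one trailing
// flag describing the response (the value being predicted).
std::vector<unsigned char> make_var_types(std::size_t attribute_count,
                                          unsigned char response_type) {
    // Every input attribute is marked as ordered (numerical).
    std::vector<unsigned char> types(attribute_count + 1, VAR_ORDERED);
    // The last slot describes the response, not an input attribute.
    types[attribute_count] = response_type;
    return types;
}
```

Calling `make_var_types(ATTRIBUTES_PER_SAMPLE, VAR_CATEGORICAL)` reproduces the classification layout above. The only slot that differs between problem types is the last one.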
And the params setup:
float priors[] = {1,1,1,1,1,1,1,1,1,1}; // weights of each classification for classes
// (all equal as equal samples of each digit)
CvRTParams params = CvRTParams(25, // max depth
5, // min sample count
0, // regression accuracy: N/A here
false, // compute surrogate split, no missing data
15, // max number of categories (use sub-optimal algorithm for larger numbers)
priors, // the array of priors
false, // calculate variable importance
4, // number of variables randomly selected at node and used to find the best split(s).
100, // max number of trees in the forest
0.01f, // forest accuracy
CV_TERMCRIT_ITER | CV_TERMCRIT_EPS // termination criteria
);
Training uses the var_type and params as follows:
CvRTrees* rtree = new CvRTrees;
rtree->train(training_data, CV_ROW_SAMPLE, training_classifications,
Mat(), Mat(), var_type, Mat(), params);
My question is: how can I set up the OpenCV Random Forest so that it works as a regressor? I have searched a lot but have not been able to find an answer. The closest explanation I have found is in this answer, but it still does not make sense to me.
I am looking for a simple answer explaining the var_type and params for regression.
Upvotes: 3
Views: 6515
Reputation: 102
To use it for regression, you just have to set the var_type of the response (the last element) to CV_VAR_ORDERED, i.e.
var_type.at<uchar>(ATTRIBUTES_PER_SAMPLE, 0) = CV_VAR_ORDERED;
You might also want to set regression_accuracy to a very small number, such as 0.0001f.
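To give a feel for why a small regression_accuracy matters, here is a rough sketch of the stopping rule as I understand it from the OpenCV documentation: a node is not split further once every training response in it is within regression_accuracy of the node's estimate. This is not OpenCV's code. `node_is_accurate_enough` is a hypothetical helper, and using the mean as the node estimate is an assumption.

```cpp
#include <cmath>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper, not part of OpenCV: returns true when every response
// in a node lies strictly within `regression_accuracy` of the node's mean,
// i.e. when the stopping rule described above would leave the node unsplit.
bool node_is_accurate_enough(const std::vector<float>& responses,
                             float regression_accuracy) {
    if (responses.empty()) {
        return true;  // nothing left to split
    }
    // Assumption: the node's estimate is the mean of its training responses.
    const float sum = std::accumulate(responses.begin(), responses.end(), 0.0f);
    const float mean = sum / static_cast<float>(responses.size());
    for (std::size_t i = 0; i < responses.size(); ++i) {
        if (std::fabs(responses[i] - mean) >= regression_accuracy) {
            return false;  // at least one sample is still too far off
        }
    }
    return true;
}
```

Under this reading, a larger threshold stops splitting earlier and gives coarser predictions. A threshold of 0, as in the classification params above, is never satisfied, so growth is then limited only by the other settings such as max depth and min sample count.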
Upvotes: 4