maldini425

Reputation: 317

Interpreting Random Forest Model Results

I would really appreciate your feedback on the interpretation of my RF model, and on how to evaluate such results in general.

57658 samples
   27 predictor
    2 classes: 'stayed', 'left' 

No pre-processing
Resampling: Cross-Validated (10 fold) 
Summary of sample sizes: 11531, 11531, 11532, 11532, 11532 
Resampling results across tuning parameters:

  mtry  splitrule   ROC        Sens       Spec        
   2    gini        0.6273579  0.9999011  0.0006250729
   2    extratrees  0.6246980  0.9999197  0.0005667791
  14    gini        0.5968382  0.9324610  0.1116113149
  14    extratrees  0.6192781  0.9740323  0.0523004026
  27    gini        0.5584677  0.7546156  0.2977507092
  27    extratrees  0.5589923  0.7635036  0.2905489827

Tuning parameter 'min.node.size' was held constant at a value of 1
ROC was used to select the optimal model using the largest value.
The final values used for the model were mtry = 2, splitrule = gini and min.node.size = 1.

After making several adjustments to the functional form of my Y variable, as well as to the way I split my data, I got the following results. My ROC improved slightly, but interestingly my Sens & Spec flipped drastically compared to my initial model.

35000 samples
   27 predictor
    2 classes: 'stayed', 'left' 

No pre-processing
Resampling: Cross-Validated (10 fold) 
Summary of sample sizes: 7000, 7000, 7000, 7000, 7000 
Resampling results across tuning parameters:

  mtry  splitrule   ROC        Sens          Spec     
   2    gini        0.6351733  0.0004618204  0.9998685
   2    extratrees  0.6287926  0.0000000000  0.9999899
  14    gini        0.6032979  0.1346653886  0.9170874
  14    extratrees  0.6235212  0.0753069696  0.9631711
  27    gini        0.5725621  0.3016414054  0.7575899
  27    extratrees  0.5716616  0.2998190728  0.7636219

Tuning parameter 'min.node.size' was held constant at a value of 1
ROC was used to select the optimal model using the largest value.
The final values used for the model were mtry = 2, splitrule = gini and min.node.size = 1.

This time, I split the data randomly rather than by time, and experimented with several mtry values using the following code:

```{r Cross Validation Part 1}
library(caret) # for createFolds() and train()

set.seed(1992) # setting a seed for replication purposes

# Partition the data into 5 equal folds
folds <- createFolds(train_data$left_welfare, k = 5)

# Tuning grid; note that "variance" is a regression-only splitrule in
# ranger -- for a two-class outcome it produces the NaN rows below
tune_mtry <- expand.grid(mtry = c(2, 10, 15, 20),
                         splitrule = c("variance", "extratrees"),
                         min.node.size = c(1, 5, 10))

sapply(folds, length) # sanity-check the fold sizes
```

And got the following results:

Random Forest 

84172 samples
   14 predictor
    2 classes: 'stayed', 'left' 

No pre-processing
Resampling: Cross-Validated (10 fold) 
Summary of sample sizes: 16834, 16834, 16834, 16835, 16835 
Resampling results across tuning parameters:

  mtry  splitrule   ROC        Sens       Spec     
   2    variance    0.5000000        NaN        NaN
   2    extratrees  0.7038724  0.3714761  0.8844723
   5    variance    0.5000000        NaN        NaN
   5    extratrees  0.7042525  0.3870192  0.8727755
   8    variance    0.5000000        NaN        NaN
   8    extratrees  0.7014818  0.4075797  0.8545012
  10    variance    0.5000000        NaN        NaN
  10    extratrees  0.6956536  0.4336180  0.8310368
  12    variance    0.5000000        NaN        NaN
  12    extratrees  0.6771292  0.4701687  0.7777730
  15    variance    0.5000000        NaN        NaN
  15    extratrees  0.5000000        NaN        NaN

Tuning parameter 'min.node.size' was held constant at a value of 1
ROC was used to select the optimal model using the largest value.
The final values used for the model were mtry = 5, splitrule = extratrees and min.node.size = 1.
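One thing to note about the code chunk above: the folds it creates are never wired into train(), and caret's trainControl(index = ...) expects lists of *training* row indices, whereas createFolds() returns held-out indices by default. A sketch of how the two could be connected, reusing the train_data and left_welfare objects from the question:

```r
library(caret)

# Ask createFolds() for training indices explicitly, then hand them
# to trainControl() so train() uses exactly these folds
train_folds <- createFolds(train_data$left_welfare, k = 5, returnTrain = TRUE)

ctrl <- trainControl(index = train_folds,
                     classProbs = TRUE,
                     summaryFunction = twoClassSummary)
```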

Upvotes: 1

Views: 598

Answers (1)

Davide ND

Reputation: 994

It looks like your random forest has almost no predictive power for the second class, "left". The best scores all have extremely high sensitivity and very low specificity, which basically means your classifier assigns everything to the class "stayed", which I imagine is the majority class. Unfortunately this is pretty bad, as it is not far from a naive classifier that labels everything as the first class.
Also, I can't quite tell whether you only tried the values 2, 14 and 27 for mtry, but if so I would strongly suggest trying the whole 3-25 range (the best values will most likely be somewhere in the middle).
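A denser grid along those lines might look like this (a sketch only, assuming the caret + ranger setup from your question, with your train_data and left_welfare objects):

```r
library(caret)

# Scan the whole suggested mtry range with classification splitrules
tune_grid <- expand.grid(mtry = 3:25,
                         splitrule = c("gini", "extratrees"),
                         min.node.size = 1)

rf_fit <- train(left_welfare ~ ., data = train_data,
                method = "ranger",
                metric = "ROC",
                trControl = trainControl(method = "cv", number = 10,
                                         classProbs = TRUE,
                                         summaryFunction = twoClassSummary),
                tuneGrid = tune_grid)
```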

Apart from that, since the performance looks rather poor (judging by the ROC), I suggest you work more on feature engineering to extract some more information. Otherwise, if you are OK with what you have, or you think nothing more can be extracted, just tweak the probability threshold for classification so that the sensitivity and specificity mirror your requirements on the classes (you might care more about misclassifying "stayed" than "left", or vice versa; I don't know your problem).
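Threshold tweaking can be as simple as the following sketch (assuming a caret model rf_fit trained with classProbs = TRUE and a held-out test_data -- both names are placeholders):

```r
library(caret)

# Predicted class probabilities instead of hard labels
probs <- predict(rf_fit, newdata = test_data, type = "prob")

# Lower the threshold below 0.5 to catch more of the rare "left" class
threshold <- 0.3
pred <- factor(ifelse(probs$left > threshold, "left", "stayed"),
               levels = c("stayed", "left"))

confusionMatrix(pred, test_data$left_welfare) # inspect the Sens/Spec trade-off
```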

Hope it helps!

Upvotes: 1
