I would really appreciate your feedback on the interpretation of my RF model and on how to evaluate its results in general.
```
57658 samples
   27 predictor
    2 classes: 'stayed', 'left'

No pre-processing
Resampling: Cross-Validated (10 fold)
Summary of sample sizes: 11531, 11531, 11532, 11532, 11532
Resampling results across tuning parameters:

  mtry  splitrule   ROC        Sens       Spec
   2    gini        0.6273579  0.9999011  0.0006250729
   2    extratrees  0.6246980  0.9999197  0.0005667791
  14    gini        0.5968382  0.9324610  0.1116113149
  14    extratrees  0.6192781  0.9740323  0.0523004026
  27    gini        0.5584677  0.7546156  0.2977507092
  27    extratrees  0.5589923  0.7635036  0.2905489827

Tuning parameter 'min.node.size' was held constant at a value of 1
ROC was used to select the optimal model using the largest value.
The final values used for the model were mtry = 2, splitrule = gini and min.node.size = 1.
```
After making several adjustments to the functional form of my Y variable, as well as to the way I split my data, I got the following results. My ROC improved slightly, but interestingly my Sens and Spec changed drastically compared to my initial model:
```
35000 samples
   27 predictor
    2 classes: 'stayed', 'left'

No pre-processing
Resampling: Cross-Validated (10 fold)
Summary of sample sizes: 7000, 7000, 7000, 7000, 7000
Resampling results across tuning parameters:

  mtry  splitrule   ROC        Sens          Spec
   2    gini        0.6351733  0.0004618204  0.9998685
   2    extratrees  0.6287926  0.0000000000  0.9999899
  14    gini        0.6032979  0.1346653886  0.9170874
  14    extratrees  0.6235212  0.0753069696  0.9631711
  27    gini        0.5725621  0.3016414054  0.7575899
  27    extratrees  0.5716616  0.2998190728  0.7636219

Tuning parameter 'min.node.size' was held constant at a value of 1
ROC was used to select the optimal model using the largest value.
The final values used for the model were mtry = 2, splitrule = gini and min.node.size = 1.
```
This time I split the data randomly rather than by time, and I experimented with several mtry values using the following code:
```{r Cross Validation Part 1}
library(caret)  # for createFolds() and train()

set.seed(1992)  # set a seed for reproducibility
folds <- createFolds(train_data$left_welfare, k = 5)  # partition the data into 5 equal folds

# Candidate tuning values (passed to train() via tuneGrid).
# Note: "variance" is ranger's regression splitrule; for a two-class outcome
# only "gini" and "extratrees" are valid, hence the NaN rows below
tune_mtry <- expand.grid(mtry = c(2, 10, 15, 20),
                         splitrule = c("variance", "extratrees"),
                         min.node.size = c(1, 5, 10))

sapply(folds, length)  # check the fold sizes
```
And got the following results:
```
Random Forest

84172 samples
   14 predictor
    2 classes: 'stayed', 'left'

No pre-processing
Resampling: Cross-Validated (10 fold)
Summary of sample sizes: 16834, 16834, 16834, 16835, 16835
Resampling results across tuning parameters:

  mtry  splitrule   ROC        Sens       Spec
   2    variance    0.5000000        NaN        NaN
   2    extratrees  0.7038724  0.3714761  0.8844723
   5    variance    0.5000000        NaN        NaN
   5    extratrees  0.7042525  0.3870192  0.8727755
   8    variance    0.5000000        NaN        NaN
   8    extratrees  0.7014818  0.4075797  0.8545012
  10    variance    0.5000000        NaN        NaN
  10    extratrees  0.6956536  0.4336180  0.8310368
  12    variance    0.5000000        NaN        NaN
  12    extratrees  0.6771292  0.4701687  0.7777730
  15    variance    0.5000000        NaN        NaN
  15    extratrees  0.5000000        NaN        NaN

Tuning parameter 'min.node.size' was held constant at a value of 1
ROC was used to select the optimal model using the largest value.
The final values used for the model were mtry = 5, splitrule = extratrees and min.node.size = 1.
```
It looks like your random forest has almost no predictive power for the second class, "left".
The best scores all have extremely high sensitivity and low specificity, which essentially means that your classifier assigns everything to the class "stayed", which I imagine is the majority class. Unfortunately that is pretty bad, as it is barely better than a naive classifier that labels everything as the first class.
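To see why that is equivalent to a naive majority-class model, here is a toy base-R check with made-up labels (caret treats the first factor level, "stayed", as the event when computing sensitivity): always predicting "stayed" gives a sensitivity of 1 and a specificity of 0, which is essentially what your first table shows.

```r
# Made-up labels: "stayed" is the majority (first/event) class
truth <- factor(c(rep("stayed", 90), rep("left", 10)),
                levels = c("stayed", "left"))

# A "classifier" that always predicts the majority class
pred <- factor(rep("stayed", 100), levels = c("stayed", "left"))

sens <- sum(pred == "stayed" & truth == "stayed") / sum(truth == "stayed")
spec <- sum(pred == "left"   & truth == "left")   / sum(truth == "left")

c(sensitivity = sens, specificity = spec)  # 1 and 0
```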
Also, I can't quite tell whether you only tried mtry values of 2, 14, and 27; if so, I would strongly suggest trying the whole 3-25 range (the best values will most likely fall somewhere in the middle).
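Concretely, that sweep can be expressed as a caret tuning grid along these lines (a sketch assuming the ranger engine, which your splitrule column suggests; only the grid is shown, not the full train() call):

```r
# Tune every mtry from 3 to 25; keep only the classification splitrules
# ("variance" is ranger's regression splitrule and would yield NaN here)
tune_grid <- expand.grid(mtry = 3:25,
                         splitrule = c("gini", "extratrees"),
                         min.node.size = 1)

nrow(tune_grid)  # 23 mtry values x 2 splitrules = 46 candidate models
```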
Apart from that, since the performance looks rather poor (judging by the ROC), I suggest you work more on feature engineering to extract more information from your data. Otherwise, if you are happy with what you have, or you think nothing more can be extracted, just tweak the probability threshold for classification so that the sensitivity and specificity mirror your requirements on the classes (you might care more about misclassifying "stayed" than "left", or vice versa; I don't know your problem).
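Tweaking the threshold only requires the predicted class probabilities; here is a base-R sketch with hypothetical probabilities, assuming "left" is the class you want to catch:

```r
# Hypothetical predicted probabilities of class "left" and true labels
p_left <- c(0.10, 0.35, 0.42, 0.55, 0.20, 0.48, 0.70, 0.15)
truth  <- factor(c("stayed", "left", "left", "left",
                   "stayed", "stayed", "left", "stayed"),
                 levels = c("stayed", "left"))

# Sensitivity/specificity for "left" at a given probability cutoff
sens_spec <- function(cutoff) {
  pred <- factor(ifelse(p_left >= cutoff, "left", "stayed"),
                 levels = c("stayed", "left"))
  c(sens = mean(pred[truth == "left"] == "left"),
    spec = mean(pred[truth == "stayed"] == "stayed"))
}

# Lowering the cutoff from 0.5 trades specificity for sensitivity
sens_spec(0.5)  # sens 0.50, spec 1.00
sens_spec(0.3)  # sens 1.00, spec 0.75
```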
Hope it helps!