Simple Decision Tree in R - Strange Results From Caret Package

Question

I'm trying to apply a simple decision tree to the following data set using the caret package, the data is:

> library(caret)
> mydata <- read.csv("https://stats.idre.ucla.edu/stat/data/binary.csv")
> mydata$rank <- factor(mydata$rank)
  # create dummy variables
> X = predict(dummyVars(~ ., data=mydata), mydata)
> head(X)

    A matrix: 6 × 7 of type dbl     
admit   gre gpa rank.1  rank.2  rank.3  rank.4
    0   380 3.61    0        0        1      0
    1   660 3.67    0        0        1      0
    1   800 4.00    1        0        0      0
    1   640 3.19    0        0        0      1
    0   520 2.93    0        0        0      1
    1   760 3.00    0        1        0      0

Splitting into a training and testing set:

> trainset <- data.frame(X[1:300,])
> testset <- data.frame(X[301:400,])

Now applying the decision tree:

> tree <- train(factor(admit) ~., data = trainset, method = "rpart")
> tree

CART 

300 samples
  6 predictor
  2 classes: '0', '1' 

No pre-processing
Resampling: Bootstrapped (25 reps) 
Summary of sample sizes: 300, 300, 300, 300, 300, 300, ... 
Resampling results across tuning parameters:

 cp          Accuracy   Kappa    
0.01956522  0.6856163  0.1865179
0.03260870  0.6888378  0.1684015
0.08695652  0.7080434  0.1079462

Accuracy was used to select the optimal model using the largest value.
The final value used for the model was cp = 0.08695652.

I get NaN in variable importance! Why?

> varImp(tree)$importance

A data.frame: 6 × 1     Overall

gre NaN
gpa NaN
rank.1  NaN
rank.2  NaN
rank.3  NaN
rank.4  NaN

and in prediction the decision tree only outputs one class, the 0 class, why? What's wrong with my code? Thanks in advance.

> y_pred <- predict(tree ,newdata=testset)
> y_test <- factor(testset$admit)
> confusionMatrix(y_pred, factor(y_test))

Confusion Matrix and Statistics

      Reference
Prediction  0  1
         0 65 35
         1  0  0

           Accuracy : 0.65            
             95% CI : (0.5482, 0.7427)
No Information Rate : 0.65            
P-Value [Acc > NIR] : 0.5458          

              Kappa : 0               

Mcnemar's Test P-Value : 9.081e-09       

        Sensitivity : 1.00            
        Specificity : 0.00            
     Pos Pred Value : 0.65            
     Neg Pred Value :  NaN            
         Prevalence : 0.65            
     Detection Rate : 0.65            
 Detection Prevalence : 1.00            
  Balanced Accuracy : 0.50            

   'Positive' Class : 0

Simple Decision Tree in R - Strange Results From Caret Package

Answers (1)

Related Questions