Reputation: 1866
I have applied a classification algorithm to my dataset and got the following stats:
Correctly Classified Instances 684 76.1693 %
Incorrectly Classified Instances 214 23.8307 %
Kappa statistic 0
Mean absolute error 0.1343
Root mean squared error 0.2582
Relative absolute error 100 %
Root relative squared error 100 %
Total Number of Instances 898
=== Detailed Accuracy By Class ===
               TP Rate  FP Rate  Precision  Recall  F-Measure  ROC Area  Class
               0        0        0          0       0          0.5       1
               0        0        0          0       0          0.5       2
               1        1        0.762      1       0.865      0.5       3
               0        0        0          0       0          ?         4
               0        0        0          0       0          0.5       5
               0        0        0          0       0          0.5       U
Weighted Avg.  0.762    0.762    0.58       0.762   0.659      0.5
=== Confusion Matrix ===
  a  b    c  d  e  f   <-- classified as
  0  0    8  0  0  0 |  a = 1
  0  0   99  0  0  0 |  b = 2
  0  0  684  0  0  0 |  c = 3
  0  0    0  0  0  0 |  d = 4
  0  0   67  0  0  0 |  e = 5
  0  0   40  0  0  0 |  f = U
I can understand much of the output; however, I have trouble interpreting the values since I am new to Weka:
1. Which error rate should I report overall?
2. How do I tell whether there is anything interesting about the model?
Upvotes: 1
Views: 1271
Reputation: 41
The ROC area is also useful for evaluating accuracy and for judging how interesting a model is. Simply put, the true positive rate is plotted against the false positive rate, and the ROC area is the area under that curve. A high ROC area, say 0.9 to 1, indicates that the model is very good at classifying instances, whereas a ROC area of 0.5 (as in your model) means that the model is no better at classification than a random method like flipping a coin.
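In case you want to read these numbers out programmatically rather than from the Explorer output, a minimal sketch might look like the following. It assumes an ARFF file (the path is a placeholder), the class as the last attribute, and J48 as an arbitrary example classifier; your actual setup will differ:

```java
import java.util.Random;

import weka.classifiers.Evaluation;
import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class RocAreaDemo {
    public static void main(String[] args) throws Exception {
        // Load the dataset (placeholder path) and use the last attribute as the class.
        Instances data = DataSource.read("dataset.arff");
        data.setClassIndex(data.numAttributes() - 1);

        // 10-fold cross-validation, as the Weka Explorer does by default.
        Evaluation eval = new Evaluation(data);
        eval.crossValidateModel(new J48(), data, 10, new Random(1));

        // Print the ROC area per class; 0.5 means "no better than chance".
        for (int i = 0; i < data.numClasses(); i++) {
            System.out.printf("Class %s: ROC area = %.3f%n",
                    data.classAttribute().value(i), eval.areaUnderROC(i));
        }
    }
}
```

The values printed here correspond to the ROC Area column in your Detailed Accuracy By Class section.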
Upvotes: 2
Reputation: 8301
The triplet Precision, Recall and F-Measure is reported together quite often because each number captures a different aspect of the model.
If you would like a single number only, then take the Percent (In)correctly Classified Instances or the Weighted Avg. F-Measure.
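To make those numbers concrete, here is a small sketch (plain arithmetic only, using the counts from your confusion matrix) that reproduces the class-3 row of your Detailed Accuracy By Class table:

```java
public class PrecisionRecallDemo {
    public static void main(String[] args) {
        // Counts for class 3 taken from the confusion matrix in the question:
        // 684 class-3 instances were predicted as class 3 (true positives),
        // 8 + 99 + 67 + 40 = 214 instances of other classes were also predicted
        // as class 3 (false positives), and no class-3 instance was predicted
        // as anything else (false negatives = 0).
        double tp = 684, fp = 8 + 99 + 67 + 40, fn = 0;

        double precision = tp / (tp + fp);                                // 684 / 898 = 0.762
        double recall    = tp / (tp + fn);                                // 684 / 684 = 1.0
        double fMeasure  = 2 * precision * recall / (precision + recall); // 0.865

        System.out.printf("precision=%.3f recall=%.3f f-measure=%.3f%n",
                precision, recall, fMeasure);
    }
}
```

Precision is only 0.762 because every instance of the other classes is also predicted as class 3, while recall is a perfect 1.0 because no class-3 instance is missed.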
The other error measures are also useful but they require deeper knowledge of statistics (which I'm lacking :-)
From Detailed Accuracy By Class and the Confusion Matrix you can see that the model is quite simple: it classifies everything as class 3. The error measures look quite good, but that is only because 76% of the instances in the dataset belong to class 3. The model corresponds to the commonly used baseline algorithm called "most common class".
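Weka ships this baseline as the ZeroR classifier, which always predicts the majority class of the training data. A quick sketch of running it on your dataset (the file path is a placeholder) should reproduce essentially the same summary numbers you posted:

```java
import java.util.Random;

import weka.classifiers.Evaluation;
import weka.classifiers.rules.ZeroR;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class BaselineDemo {
    public static void main(String[] args) throws Exception {
        // Load the same dataset (placeholder path) and use the last attribute as the class.
        Instances data = DataSource.read("dataset.arff");
        data.setClassIndex(data.numAttributes() - 1);

        // ZeroR always predicts the most common class seen during training.
        Evaluation eval = new Evaluation(data);
        eval.crossValidateModel(new ZeroR(), data, 10, new Random(1));

        // With 684 of 898 instances in class 3, this should print roughly
        // 76 % correct and a kappa of 0 -- the same numbers as your model.
        System.out.printf("Correct: %.4f %%  Kappa: %.4f%n",
                eval.pctCorrect(), eval.kappa());
    }
}
```

Any model worth keeping should beat this baseline; yours does not, which is exactly what the kappa statistic of 0 and the ROC area of 0.5 are telling you.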
Upvotes: 3