R table to assess model performance--observed versus predicted class

Question

I'm predicting a variable with 10 levels, using rpart for classification. The code I use to build the table is

as.vector(t(table(predict(bb.rt,set[train,],type="class"),response[train])))

But the result is not what I want:

Observed Class →
Predicted Class ↓

               1    2   3   4    5  6    7 8   9   10
         1  26.0  0.0 0.6 0.0  0.0  0  0.0 0 0.0  0.2
         10  0.2  0.0 0.0 0.0  0.4  0  0.0 0 0.4 12.8
         2   0.0 45.6 0.6 1.4  0.6  0  0.0 0 0.0  0.0
         3   0.2  0.0 6.0 0.0  0.0  0  0.0 0 0.0  0.0
         4   0.0  0.2 0.0 3.4  0.0  0  0.0 0 0.0  0.0
         5   0.0  0.0 0.0 0.0 11.8  0  0.0 0 0.0  0.0
         6   0.0  0.0 0.0 0.0  0.0 19  0.0 0 0.0  0.0
         7   0.0  0.8 0.0 0.0  0.0  0 16.8 0 0.0  0.0
         8   0.0  0.0 0.0 0.0  0.0  0  0.0 4 0.0  0.0
         9   0.0  0.0 0.0 0.0  0.0  0  0.0 0 9.4  0.6

The predicted class is sorted in alphabetical order but the observed class is not. I need them sorted in the same way so that I can compare the values on the diagonal of the matrix with the other values.

doug · Accepted Answer

If I understood your question correctly, it seems you just want a confusion matrix.

These are not difficult to calculate manually, but there are (at least) a dozen built-in functions across the various R packages that handle all of this for you: the data processing, table formatting, error checking, etc. The built-in function I use below also calculates classification error.

The package mda has a built-in function called confusion. You use it like so:

> library(mda)
> data(iris)
> iris_fit = fda(Species ~., data=iris)

> CM = confusion(predict(iris_fit, iris), iris$Species)
> # observed classification (true) is column-wise;
> # predicted is row-wise 
> CM

            true
   predicted    setosa versicolor virginica
   setosa         50          0         0
   versicolor      0         48         1
   virginica       0          2        49

   attr(,"error")
   [1] 0.02
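The 0.02 reported in the error attribute matches the share of off-diagonal counts in the table (3 of 150). A minimal sketch of that arithmetic in base R, with the counts retyped by hand from the output above; `cm` and `err` are illustrative names, not part of mda:

```r
# Counts copied from the printed confusion matrix
# (rows = predicted, columns = true)
classes <- c("setosa", "versicolor", "virginica")
cm <- matrix(c(50,  0,  0,
                0, 48,  1,
                0,  2, 49),
             nrow = 3, byrow = TRUE,
             dimnames = list(predicted = classes, true = classes))

# Misclassification rate: everything that is not on the diagonal
err <- 1 - sum(diag(cm)) / sum(cm)
err
```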

Again, there are many more functions among the third-party packages on CRAN that calculate a confusion matrix.

A quick search of the R package space using the sos package gave these results:

> library(sos)

> findFn("confusion", maxPages=5, sortby="MaxScore")

I deliberately limited this search to just the top 5 pages of results (87 individual functions returned). From these results, other R packages that have a confusion matrix function:

zmisclassification.matrix in package fpc
pamr.confusion in package pamr
confusion in package DAAG
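
For the manual route, the row and column order of a table follows the factor levels of each argument, so one way to get both axes in the same order is to give both vectors the same explicit levels before calling table. A minimal base-R sketch under that assumption; the data are made up, and `lv`, `observed`, and `predicted` are hypothetical stand-ins for the question's response[train] and predict(bb.rt, ...) output:

```r
# Hypothetical labels for a 10-class problem (not the asker's data)
set.seed(1)
lv <- as.character(1:10)                 # desired order: 1, 2, ..., 10
observed  <- sample(lv, 200, replace = TRUE)
predicted <- observed
flip <- sample(200, 20)
predicted[flip] <- sample(lv, 20, replace = TRUE)   # inject some errors

# Same explicit levels on both vectors => same order on rows and columns,
# and a square table even if some class is never predicted
cm <- table(predicted = factor(predicted, levels = lv),
            observed  = factor(observed,  levels = lv))
cm

# Share of cases on the diagonal (correctly classified)
sum(diag(cm)) / sum(cm)
```

With the levels aligned, diag(cm) holds the correctly classified counts and every off-diagonal cell is a misclassification.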
