Tong He

Reputation: 286

R table to assess model performance--observed versus predicted class

I'm predicting one variable with 10 levels, using rpart for classification. The code that builds the table is

as.vector(t(table(predict(bb.rt,set[train,],type="class"),response[train])))

But the result is not what I need (observed class across the columns, predicted class down the rows):

               1    2   3   4    5  6    7 8   9   10
         1  26.0  0.0 0.6 0.0  0.0  0  0.0 0 0.0  0.2
         10  0.2  0.0 0.0 0.0  0.4  0  0.0 0 0.4 12.8
         2   0.0 45.6 0.6 1.4  0.6  0  0.0 0 0.0  0.0
         3   0.2  0.0 6.0 0.0  0.0  0  0.0 0 0.0  0.0
         4   0.0  0.2 0.0 3.4  0.0  0  0.0 0 0.0  0.0
         5   0.0  0.0 0.0 0.0 11.8  0  0.0 0 0.0  0.0
         6   0.0  0.0 0.0 0.0  0.0 19  0.0 0 0.0  0.0
         7   0.0  0.8 0.0 0.0  0.0  0 16.8 0 0.0  0.0
         8   0.0  0.0 0.0 0.0  0.0  0  0.0 4 0.0  0.0
         9   0.0  0.0 0.0 0.0  0.0  0  0.0 0 9.4  0.6

The predicted class is sorted in alphabetical order ("1", "10", "2", ...) but the observed class is not. I need both sorted the same way so that I can compare the values on the diagonal of the matrix with the other values.

Upvotes: 1

Views: 3573

Answers (2)

doug

Reputation: 70028

If I understood your question correctly, you just want a confusion matrix.

Confusion matrices are not difficult to calculate manually, but there are (at least) a dozen functions across the various R packages that handle all of this for you: the data processing, table formatting, error checking, etc. The function I use below also calculates the classification error.

The package mda has a function called confusion. Use it like so:

> library(mda)
> data(iris)
> iris_fit = fda(Species ~., data=iris)

> CM = confusion(predict(iris_fit, iris), iris$Species)
> # observed classification (true) is column-wise;
> # predicted is row-wise 
> CM

            true
   predicted    setosa versicolor virginica
   setosa         50          0         0
   versicolor      0         48         1
   virginica       0          2        49

   attr(,"error")
   [1] 0.02
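For comparison, here is a minimal base-R sketch of the same idea that also addresses the ordering problem in the question. It assumes the classes are coded 1 to 10; the vectors `observed` and `predicted` are made-up illustration data, not the asker's. Giving both factors the same explicit `levels` makes `table()` order rows and columns identically, so the diagonal holds the correct predictions.

```r
# Hypothetical data: classes coded 1..10 (not the asker's actual data)
lv <- as.character(1:10)
observed  <- c(1, 2, 10, 3, 10, 2, 9, 1)
# A factor built from strings sorts its levels as "1", "10", "2", ...
predicted <- factor(c("1", "2", "10", "3", "9", "2", "9", "1"))

# Re-level both vectors with the same explicit level order
cm <- table(predicted = factor(predicted, levels = lv),
            observed  = factor(observed,  levels = lv))

# Misclassification rate: share of counts off the diagonal
error <- 1 - sum(diag(cm)) / sum(cm)
```

With the asker's objects, the same re-levelling would presumably be applied to `predict(bb.rt, set[train,], type="class")` and `response[train]` before calling `table()`.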

Again, many other third-party packages on CRAN provide functions to calculate a confusion matrix.

A quick search of the R package space using the sos package gave these results:

> library(sos)

> findFn("confusion", maxPages=5, sortby="MaxScore")

I deliberately limited this search to just the top 5 pages of results (87 individual functions returned). From these results, other R packages that have a confusion matrix function:

  • zmisclassification.matrix in package fpc

  • pamr.confusion in package pamr

  • confusion in package DAAG

Upvotes: 1

csgillespie

Reputation: 60452

You just need to rearrange the rows and columns using the standard subsetting operator []. First, create some example data:

R> dd = data.frame(x=1:4, z=5:8, y=10:13)
R> rownames(dd) = 4:1  
R> dd
  x z  y
4 1 5 10
3 2 6 11
2 3 7 12
1 4 8 13

Next I specify the order of the rows and columns:

R> dd[sort(rownames(dd)), sort(colnames(dd))]
  x  y z
1 4 13 8
2 3 12 7
3 2 11 6
4 1 10 5
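One caveat for the table in the question: `sort()` on row names compares them as strings, so "10" still lands before "2". A small sketch of a numeric reordering follows; the matrix `tab` is made-up illustration data, and it assumes every row name can be converted with `as.numeric()`.

```r
# Hypothetical 4x4 table: rows in string order, columns in numeric order
tab <- matrix(1:16, nrow = 4,
              dimnames = list(predicted = c("1", "10", "2", "3"),
                              observed  = c("1", "2", "3", "10")))

# order() on the numeric values of the names yields 1, 2, 3, 10
row_order  <- order(as.numeric(rownames(tab)))
tab_sorted <- tab[row_order, , drop = FALSE]

# Rows and columns now share one order, so diag() lines up
rownames(tab_sorted)
diag(tab_sorted)
```

The same indexing works on the output of `table()`, since a two-way table can be subset like a matrix.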

Upvotes: 1
