mauna

Reputation: 1118

How to interpret H2O's confusion matrix?

I am using h2o version 3.10.4.8.

library(magrittr)
library(h2o)

# Start a local H2O cluster using all available cores and up to 6 GB of memory
h2o.init(nthreads = -1, max_mem_size = "6g")

data.url <- "https://raw.githubusercontent.com/DarrenCook/h2o/bk/datasets/"

# Import the iris dataset (with header) into H2O
iris.hex <- paste0(data.url, "iris_wheader.csv") %>%
  h2o.importFile(destination_frame = "iris.hex")

# Response column and predictor columns
y <- "class"
x <- setdiff(names(iris.hex), y)

# Fit a multinomial GLM and predict on the same (training) data
model.glm <- h2o.glm(x, y, iris.hex, family = "multinomial")
preds <- h2o.predict(model.glm, iris.hex)

# Training confusion matrix vs. counts of each predicted class
h2o.confusionMatrix(model.glm)
h2o.table(preds["predict"])

This is the output of h2o.confusionMatrix(model.glm):

Confusion Matrix: vertical: actual; across: predicted
                Iris-setosa Iris-versicolor Iris-virginica  Error      Rate
Iris-setosa              50               0              0 0.0000 =  0 / 50
Iris-versicolor           0              48              2 0.0400 =  2 / 50
Iris-virginica            0               1             49 0.0200 =  1 / 50
Totals                   50              49             51 0.0200 = 3 / 150

Since it says across: predicted, I interpret this to mean that the model made 50 (0 + 48 + 2) Iris-versicolor predictions.

This is the output of h2o.table(preds["predict"]):

          predict Count
1     Iris-setosa    50
2 Iris-versicolor    49
3  Iris-virginica    51

This tells me that the model made 49 Iris-versicolor predictions.
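
One way to cross-check which axis is which is to pull both label columns into R and cross-tabulate them directly. A minimal sketch, using the iris.hex, preds and y objects defined above:

labels.df <- as.data.frame(h2o.cbind(iris.hex[y], preds["predict"]))

# rows = actual labels, columns = predicted labels, by construction
table(actual = labels.df$class, predicted = labels.df$predict)

Comparing that table with H2O's confusion matrix would show whether its columns really are the predicted labels.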

Is the confusion matrix incorrectly labelled, or did I make a mistake in interpreting the results?

Upvotes: 2

Views: 3075

Answers (2)

Erin LeDell

Reputation: 8819

You did not make a mistake; the labels are confusing (and were causing people to think that the rows and columns were switched). This was fixed recently and will be included in the next release of H2O.

Upvotes: 1

TomKraljevic

Reputation: 3671

Row names (vertical) are the actual labels.

Column names (across) are the predicted labels.
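
A minimal sketch of that orientation, assuming the iris.hex, preds and y objects from the question: rebuilding the cross-tabulation in plain R, the column sums give the per-class prediction counts, and the per-row error rates reproduce the Error column.

# rows = actual, columns = predicted
cm <- table(actual    = as.data.frame(iris.hex[y])$class,
            predicted = as.data.frame(preds["predict"])$predict)

colSums(cm)                 # per-class prediction counts: 50 / 49 / 51
1 - diag(cm) / rowSums(cm)  # per-actual-class error: 0.00 / 0.04 / 0.02

These should match h2o.table(preds["predict"]) and the Error column of h2o.confusionMatrix(model.glm), respectively.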

Upvotes: 3
