Chris T.

Reputation: 1801

`table` not showing in matrix format

I'm trying to build a confusion table using the HMDA data from the AER package. I fit a probit model, predicted on the test set, and used table() to produce a 2-by-2 table, but R returns a long list instead of the 2-by-2 matrix I wanted.

Could anyone tell me what's going on?

# load required packages and data (HMDA)
library(e1071)
library(caret)
library(AER)
library(plotROC)
data(HMDA)

# again, check variable columns
names(HMDA)

# convert dependent variables to numeric
HMDA$deny <- ifelse(HMDA$deny == "yes", 1, 0)

# subset needed columns
subset <- c("deny", "hirat", "lvrat", "mhist", "unemp")

# subset data
data <- HMDA[complete.cases(HMDA), subset]

# do a 75-25 train-test split
train_row_numbers <- createDataPartition(data$deny, p=0.75, list=FALSE)
training <- data[train_row_numbers, ]
testing <- data[-train_row_numbers, ]


# fit a probit model and predict on testing data
probit.fit <- glm(deny ~ ., family = binomial(link = "probit"), data = training)
probit.pred <- predict(probit.fit, testing)

confmat_probit <- table(Predicted = probit.pred, 
               Actual = testing$deny)
confmat_probit

Upvotes: 0

Views: 42

Answers (1)

Edward

Reputation: 19259

You need to specify a threshold (cut-point) to turn the predictions into a dichotomous outcome. predict() returns continuous predicted values, not 0/1, so table() creates one row for every distinct value instead of two.

Also be careful with predict(): its default is type="link", which for your model returns the linear predictor on the probit scale rather than a probability. If you want predict() to return probabilities, specify type="response".

probit.pred <- predict(probit.fit, testing, type="response")
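
To see the difference between the two prediction types, here is a minimal sketch on simulated data (the data frame `d` and its columns are made up for illustration, not the HMDA data). It checks that, for a probit model, applying the standard normal CDF to the "link" predictions reproduces the "response" probabilities:

```r
# Toy data: an assumption for illustration only, not the HMDA data
set.seed(1)
x <- rnorm(200)
y <- rbinom(200, 1, pnorm(0.5 * x - 0.3))
d <- data.frame(y = y, x = x)

fit <- glm(y ~ x, family = binomial(link = "probit"), data = d)

eta <- predict(fit, d)                     # default type = "link": linear predictor
p   <- predict(fit, d, type = "response")  # predicted probabilities

# The inverse of the probit link is the standard normal CDF, pnorm()
all.equal(unname(pnorm(eta)), unname(p))

range(eta)  # unbounded scale, can be negative
range(p)    # always between 0 and 1
```

A cut-point such as 0.1 or 0.5 only makes sense on the probability scale, which is why type="response" matters here.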

Then choose a cut-point (0.1 below is just an example); any prediction above this value will be TRUE:

confmat_probit <- table(`Predicted>0.1` = probit.pred > 0.1 , Actual = testing$deny)
confmat_probit

             Actual
Predicted>0.1   0   1
        FALSE 248  21
        TRUE  273  53
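
As a rough illustration of why the original call produced a long listing, the sketch below uses made-up probabilities and outcomes (`prob`, `actual`, and `cutoff` are hypothetical names, not from the question). It also shows one way to guarantee a 2-by-2 result even when one class is never predicted, by fixing the factor levels up front:

```r
set.seed(42)
actual <- rbinom(100, 1, 0.3)  # made-up 0/1 outcomes
prob   <- runif(100)           # made-up predicted probabilities

# Cross-tabulating raw probabilities gives one row per distinct value
long_tab <- table(Predicted = prob, Actual = actual)
nrow(long_tab)

# Threshold first; explicit levels keep both rows and columns
# even if a class never occurs among the predictions
cutoff <- 0.5  # illustrative choice, not a recommendation
pred_class <- factor(as.integer(prob > cutoff), levels = c(0, 1))
confmat <- table(Predicted = pred_class,
                 Actual = factor(actual, levels = c(0, 1)))
confmat
dim(confmat)
```

With a logical vector such as `prob > cutoff`, table() drops a row if every prediction falls on one side of the cut-point; the factor version avoids that.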

Upvotes: 1
