Reputation: 65
What is the difference between using "mse" and "class" in the glmnet package?
log_x <- model.matrix(response~.,train)
log_y <- ifelse(train$response=="good",1,0)
log_cv <- cv.glmnet(log_x,log_y,alpha=1,family="binomial", type.measure = "class")
summary(log_cv)
plot(log_cv)
vs.
log_x <- model.matrix(response~.,train)
log_y <- ifelse(train$response=="good",1,0)
log_cv <- cv.glmnet(log_x,log_y,alpha=1,family="binomial", type.measure = "mse")
summary(log_cv)
plot(log_cv)
I'm noticing a slightly different curve (or smoothness) in my plot, and a difference of a few percent in accuracy. But for predicting a binomial class response, is one type measure more appropriate than the other?
Upvotes: 4
Views: 3629
Reputation: 9
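First, for the binomial family the default type.measure in cv.glmnet is the deviance, not the misclassification error.
Secondly, mean squared error is rarely a sensible loss for the binomial family. cv.glmnet does accept type.measure = "mse" there (it measures the squared deviation of the fitted probability from the 0/1 response), but it is not the usual choice for a classifier.
The usual type measures for the binomial family in cv.glmnet are the default "deviance", "auc" (area under the ROC curve) and "class" (the misclassification error).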
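As a rough sketch of how one might compare the measures in practice, the snippet below fits cv.glmnet on simulated data under each measure, with "mse" from the question included for contrast. The data, the seed and the variable names (x, y, foldid, fits) are made up for illustration; only cv.glmnet and its family, alpha, type.measure and foldid arguments come from the glmnet package.

```r
library(glmnet)

set.seed(1)
n <- 200
# Simulated predictors; only the first two columns carry signal (assumption)
x <- matrix(rnorm(n * 10), n, 10)
y <- rbinom(n, 1, plogis(x[, 1] - x[, 2]))

# Reuse the same folds for every measure, so that any difference in the
# selected lambda comes from the measure and not from the random split
foldid <- sample(rep(1:10, length.out = n))

measures <- c("deviance", "class", "auc", "mse")
fits <- lapply(measures, function(m) {
  cv.glmnet(x, y, family = "binomial", alpha = 1,
            type.measure = m, foldid = foldid)
})
names(fits) <- measures

# Lambda selected under each measure
sapply(fits, function(f) f$lambda.min)
```

Fixing foldid matters here: without it, each call draws its own folds and the comparison mixes the effect of the measure with the effect of the split.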
Upvotes: 0
Reputation: 2652
It depends on your case study and what you want to learn from your model. From the help files:
The default is type.measure="deviance", which uses squared-error for gaussian models (a.k.a type.measure="mse" there) [...]. type.measure="class" applies to binomial and multinomial logistic regression only, and gives misclassification error
Therefore, you have to ask yourself whether, in your problem, you want to minimize misclassification error or the mean squared error.
There is no straightforward answer as to which is best. They are two different statistics, and cross-validation uses whichever one you pick to choose the best penalization parameter (lambda) among the candidate models.
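To make the difference concrete, here is a small base R sketch that computes the three statistics by hand on a toy example. The labels y and the fitted probabilities p are invented for illustration, and glmnet may scale its own "mse" and "deviance" values by a constant, so treat this as a picture of what each measure reacts to rather than a reproduction of cv.glmnet's numbers.

```r
# Toy data: true 0/1 labels and hypothetical fitted probabilities
y <- c(1, 0, 1, 1, 0, 0, 1, 0)
p <- c(0.9, 0.2, 0.6, 0.4, 0.1, 0.55, 0.8, 0.3)

# Misclassification error: threshold at 0.5, then count the mistakes
class_err <- mean((p > 0.5) != y)

# Mean squared error between the probability and the 0/1 label
mse <- mean((y - p)^2)

# Binomial deviance: -2 times the mean log-likelihood
dev <- -2 * mean(y * log(p) + (1 - y) * log(1 - p))

# Make the first prediction less confident without crossing 0.5
p2 <- p
p2[1] <- 0.7
class_err2 <- mean((p2 > 0.5) != y)  # same predicted classes as before
mse2 <- mean((y - p2)^2)             # penalizes the lost confidence

c(class_err = class_err, class_err2 = class_err2, mse = mse, mse2 = mse2)
```

The misclassification error only moves when a prediction crosses the 0.5 threshold, so it changes in steps as lambda varies, whereas the squared error and the deviance respond to every change in the fitted probabilities. That is one plausible reason why the two cross-validation curves differ in smoothness.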
Upvotes: 2