Asra Khalid
Asra Khalid

Reputation: 1

Confusion matrix: What does it mean to have 0 value in true negative?

I am working on churn prediction data set using logistic regression. The model is predicting 95% accuracy but confusion matrix is giving following output:

array([[1517,    0],
       [  70,    0]], dtype=int64)

How can I make model to predict true negatives too?

Upvotes: 0

Views: 4561

Answers (3)

TarunD
TarunD

Reputation: 1

Logistic regression errors at times due to scaling issues - try out scaling all the variables regressed

Upvotes: 0

B200011011
B200011011

Reputation: 4258

I want to add to answer by PV8, as mentioned above this is a highly imbalanced dataset. You could look into different metrics such as ROC, PR curve, balanced accuracy score, stratified k fold cross validation, adjust class weights and try under/over sampling. It may be a good idea to try other mentioned approaches before going with sampling.

Since, you are using scikit-learn you can use imbalanced-learn package along with it, https://imbalanced-learn.readthedocs.io/en/stable/install.html. It will provide various under/over sampling algorithms, classifiers, metrics for imbalanced datasets.

Upvotes: 0

PV8
PV8

Reputation: 6260

This is a typical problem of inbalanced data.

Your logistic classification is only prediction one class (in this case class 0) and is not respecting any other outcome at all.

There are tons of keywords/ideas to solve this solution which would be outside of this scope here. To give you some buzzwords:

  • Over/Undersampling
  • Outlier detection
  • Change classifier optimization problem

There is no basic solution for this kind of problem, you really need to work on that topic!

Upvotes: 1

Related Questions