Reputation: 1
I am working on a churn prediction dataset using logistic regression. The model reports 95% accuracy, but the confusion matrix gives the following output:
array([[1517, 0],
[ 70, 0]], dtype=int64)
How can I make the model predict the other class too?
Upvotes: 0
Views: 4561
Reputation: 1
Logistic regression sometimes fits poorly because of feature scaling issues. Try scaling all the input variables before fitting.
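A minimal sketch of that scaling suggestion, assuming scikit-learn. The synthetic data from `make_classification` is only a stand-in for the churn dataset, and the inflated first column is a made-up way to mimic features on very different scales. Scaling helps the solver, but on its own it may not change the all-zeros predictions.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the churn data (assumption): roughly 5% positives.
X, y = make_classification(
    n_samples=2000, n_features=10, weights=[0.95, 0.05], random_state=0
)
# Put one feature on a much larger scale than the others (illustrative only).
X[:, 0] *= 1000.0

# The scaler is fitted inside the pipeline, so it only ever sees the data
# passed to fit() and is reapplied automatically at predict time.
scaled_model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
scaled_model.fit(X, y)
predictions = scaled_model.predict(X)
```

Keeping the scaler in the pipeline also avoids leaking test-set statistics when the model is cross-validated.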
Upvotes: 0
Reputation: 4258
I want to add to the answer by PV8: as mentioned there, this is a highly imbalanced dataset. You could look into different metrics such as ROC, PR curve, and balanced accuracy score, use stratified k-fold cross-validation, adjust class weights, and try under/over sampling. It may be a good idea to try the other approaches mentioned before resorting to sampling.
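A rough sketch of the non-sampling options above, assuming scikit-learn: class weights, stratified k-fold cross-validation, and metrics that are less misleading than plain accuracy. The synthetic data is a stand-in for the real churn data, and the 0.5 threshold is just the usual default, not a recommendation.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (
    average_precision_score,
    balanced_accuracy_score,
    confusion_matrix,
    roc_auc_score,
)
from sklearn.model_selection import StratifiedKFold, cross_val_predict

# Synthetic stand-in for the churn data (assumption): roughly 5% positives.
X, y = make_classification(
    n_samples=2000, n_features=10, weights=[0.95, 0.05], random_state=0
)

# Stratified folds keep the class ratio roughly constant in every split.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# class_weight="balanced" reweights errors inversely to class frequency.
model = LogisticRegression(class_weight="balanced", max_iter=1000)

# Out-of-fold probabilities for the positive class.
proba = cross_val_predict(model, X, y, cv=cv, method="predict_proba")[:, 1]
pred = (proba >= 0.5).astype(int)

cm = confusion_matrix(y, pred)               # rows: true class, cols: predicted
bal_acc = balanced_accuracy_score(y, pred)   # mean of per-class recall
roc_auc = roc_auc_score(y, proba)            # area under the ROC curve
pr_auc = average_precision_score(y, proba)   # summary of the PR curve

print(cm)
print(f"balanced accuracy={bal_acc:.3f} ROC AUC={roc_auc:.3f} PR AUC={pr_auc:.3f}")
```

With weighting in place, the second column of the confusion matrix should no longer be all zeros, usually at the cost of some false positives.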
Since you are using scikit-learn, you can use the imbalanced-learn package along with it: https://imbalanced-learn.readthedocs.io/en/stable/install.html. It provides various under/over sampling algorithms, classifiers, and metrics for imbalanced datasets.
Upvotes: 0
Reputation: 6260
This is a typical problem with imbalanced data.
Your logistic regression classifier is only predicting one class (in this case class 0) and never predicts the other class at all.
There are tons of keywords/ideas for solving this problem, which would be outside the scope here. To give you some buzzwords:
There is no basic solution for this kind of problem; you really need to work on that topic!
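To make the point about predicting only class 0 concrete, the numbers from the question's confusion matrix can be worked through directly. This sketch assumes the scikit-learn layout (rows are true classes, columns are predicted classes) and uses only NumPy.

```python
import numpy as np

# Confusion matrix from the question (rows: true class, cols: predicted class).
cm = np.array([[1517, 0],
               [70, 0]])
tn, fp, fn, tp = cm.ravel()

accuracy = (tn + tp) / cm.sum()        # 1517 / 1587, about 0.956
recall_pos = tp / (tp + fn)            # 0 / 70: no positive case is found
recall_neg = tn / (tn + fp)            # 1517 / 1517: every negative is found
balanced_accuracy = (recall_pos + recall_neg) / 2
predicted_positive = fp + tp           # how often class 1 is predicted at all

print(f"accuracy={accuracy:.3f}, balanced accuracy={balanced_accuracy:.3f}")
print(f"recall (class 1)={recall_pos:.3f}, predicted positives={predicted_positive}")
```

The 95% accuracy is simply the share of class 0 in the data; the balanced accuracy of 0.5 shows the model does no better than always answering 0.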
Upvotes: 1