Reputation: 35
I have a binary classification problem where the class distribution is {0: 85%, 1: 15%}. I have tried re-weighting the classes via class_weights and other sampling approaches, but every approach I have used gives me unsatisfactory results. My dataset has shape (91125, 57).
Accuracy:1
F1-Score:1
F2-Score:1
Precision:1
Recall:1
AUCROC:1
Kappa:1
Is there any other method I can use to handle such a situation?
Upvotes: 1
Views: 233
Reputation: 410
Make sure you're dropping the target variable from your features before feeding the data to the classifier:
X = df.drop('target',axis=1)
y = df['target']
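As a rough sketch of that check, the snippet below splits off the target and also flags any feature column that is an exact copy of it. The helper name `split_features_target`, the column name `label_copy`, and the toy data are made up for illustration; only the `'target'` column name comes from the lines above.

```python
import pandas as pd

def split_features_target(df, target_col="target"):
    """Return X, y and the names of feature columns identical to the target."""
    X = df.drop(columns=[target_col])
    y = df[target_col]
    # A feature that equals the target row by row is a direct leak
    duplicates = [c for c in X.columns if (X[c] == y).all()]
    return X, y, duplicates

# Toy frame (hypothetical): 'label_copy' duplicates the target
df = pd.DataFrame({
    "f1": [0.1, 0.5, 0.3, 0.9],
    "label_copy": [0, 1, 0, 1],
    "target": [0, 1, 0, 1],
})
X, y, leaks = split_features_target(df)
print(leaks)
```

If the returned list is not empty, those columns are worth dropping (or at least explaining) before training again.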
I'd also check whether any independent variables are highly correlated with the target. That may give you an idea of what is causing the unrealistically perfect classification:
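import seaborn as sns
# Add the target back so the heatmap includes feature-target correlations
corr = X_train.assign(target=y_train).corr()
sns.heatmap(corr)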
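With 56 features a heatmap can be hard to read, so a ranked list may be easier to scan. The following is a minimal sketch, assuming numeric features and a 0/1 target; the function name `rank_by_target_correlation`, the synthetic data, and the 0.95 cutoff are illustrative assumptions, not something from the question.

```python
import numpy as np
import pandas as pd

def rank_by_target_correlation(X, y):
    """Absolute Pearson correlation of each numeric feature with y, largest first."""
    numeric = X.select_dtypes(include="number")
    return numeric.corrwith(y).abs().sort_values(ascending=False)

# Synthetic example: 'leaky' is a linear function of the target, 'noise' is not
rng = np.random.default_rng(0)
y = pd.Series(rng.integers(0, 2, size=200), name="target")
X = pd.DataFrame({
    "noise": rng.normal(size=200),
    "leaky": y * 2.0 + 1.0,
})

ranking = rank_by_target_correlation(X, y)
# Assumed threshold: anything this close to 1 deserves a manual look
suspects = ranking[ranking > 0.95]
print(suspects)
```

Pearson correlation only picks up linear relationships, so a feature can still leak the target without showing up here; treat an empty result as inconclusive rather than as proof that there is no leakage.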
Upvotes: 1