Why all the true positives are classified as true negatives in the machine learning model?

Question

I fit a random forest model for the data. I divided my dataset into training and testing in the ratio of 70:30 and trained the model. I got an accuracy of 80% for the test data. Then I took a benchmark dataset and tested the model with that dataset. That dataset only contained data with true labels(1). But when I get the prediction for the benchmark dataset using the model all the true positives are classified as true negatives. Accuracy is 90%. Why is that? Is there a way to interpret this?

X = dataset.iloc[:, 1:11].values    
y=dataset.iloc[:,11].values

X_train,X_test,y_train,y_test=train_test_split(X,y,test_size=0.3,random_state=1,shuffle='true')

XBench_test=benchmarkData.iloc[:, 1:11].values
YBench_test=benchmarkData.iloc[:,11].values

classifier=RandomForestClassifier(n_estimators=35,criterion='entropy',max_depth=30,min_samples_split=2,min_samples_leaf=1,max_features='sqrt',class_weight='balanced',bootstrap='true',random_state=0,oob_score='true')
classifier.fit(X_train,y_train)
y_pred=classifier.predict(X_test)

y_pred_benchmark=classifier.predict(XBench_test)

print("Accuracy on test data: {:.4f}".format(classifier.score(X_test, y_test)))\*This gives 80%*\

print("Accuracy on benchmark data: {:.4f}".format(classifier.score(XBench_test, YBench_test))) \*This gives 90%*\

Why all the true positives are classified as true negatives in the machine learning model?

Answers (1)

Related Questions