Cross Validation in Classification

Question

I have two different datasets, dataset X and dataset Y, from which I calculate features to use for classification.

Case 1. When I combine both into one large dataset and then use 10-fold cross-validation, I get very good classification results, with accuracy and AUC both > 95%.

Case 2. Yet if I use one of the datasets for training and the other for testing, the results drop severely, with both accuracy and AUC falling to ~50%.
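
To make the two setups concrete, here is a minimal sketch in plain Python. It is only an illustration under stated assumptions: the toy data from `make_dataset`, the 1-nearest-neighbour classifier, and all function names are hypothetical stand-ins, not the actual features or classifier behind the numbers above. The two toy datasets are deliberately built with an offset between them, purely so that the two evaluation schemes can be compared side by side.

```python
import random

def predict_1nn(train_X, train_y, x):
    """Label of the training point closest to x (squared Euclidean distance)."""
    best = min(range(len(train_X)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)))
    return train_y[best]

def accuracy(train_X, train_y, test_X, test_y):
    """Fraction of test points whose 1-NN prediction matches the true label."""
    hits = sum(predict_1nn(train_X, train_y, x) == y
               for x, y in zip(test_X, test_y))
    return hits / len(test_y)

def pooled_cv_accuracy(X, y, k=10, seed=0):
    """Case 1: shuffle the pooled samples, split into k folds, average the fold accuracies."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        scores.append(accuracy([X[i] for i in train], [y[i] for i in train],
                               [X[i] for i in fold], [y[i] for i in fold]))
    return sum(scores) / k

def make_dataset(offset):
    """Hypothetical 1-D toy data: 10 samples per class, all shifted by `offset`."""
    X = [(offset + c + 0.05 * i,) for c in (0, 1) for i in range(10)]
    y = [c for c in (0, 1) for _ in range(10)]
    return X, y

X_a, y_a = make_dataset(offset=0.0)    # stands in for dataset X (assumption)
X_b, y_b = make_dataset(offset=10.0)   # stands in for dataset Y (assumption)

case1 = pooled_cv_accuracy(X_a + X_b, y_a + y_b, k=10)  # Case 1: pooled 10-fold CV
case2 = accuracy(X_a, y_a, X_b, y_b)                    # Case 2: train on X, test on Y
print(f"pooled 10-fold CV accuracy: {case1:.2f}")
print(f"train on X, test on Y:      {case2:.2f}")
```

On this toy data the pooled score comes out at 1.00 and the cross-dataset score at 0.50. In Case 1 every test fold has close neighbours from its own dataset in the training set, while in Case 2 the classifier never sees a single sample from the test dataset.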

My questions are:

Which case's results are more reliable?
And why is there such a huge difference in the results?

Thanks.
