Pallavi Patil

Reputation: 11

Train and test sets for an ML algorithm

I have an SVM model that was trained on 33 datasets and evaluated with LOOCV. I have since collected another 13 datasets, which I split in a leave-one-out fashion: in the testing phase I combine the 33 training datasets with 12 of the new ones, train a model on those 45 datasets, and test it on the one remaining dataset, repeating this for each of the 13 new datasets (similar to LOOCV). Is this method of testing right? All the recordings are independent of each other and can be treated as IID.

Upvotes: 0

Views: 46

Answers (1)

bwe69

Reputation: 103

No. LOOCV is generally used only for small datasets, or when you want an accurate estimate of your model's performance.

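For example, your train accuracy might be 90% while your test accuracy is only 50%.
A gap like this is a sign of overfitting, and with such a large train set and such a small test set, each individual test score is too noisy to reveal it reliably.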
[Image: overfitting in ML models]

Assuming all of your datasets are the same size, training on 45 and testing on 1 gives a train/test split of roughly 98% / 2%.
The general rule of thumb for the train/test split is 80% / 20%.
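
The split sizes can be checked with a few lines of arithmetic. This is only a sketch that assumes 33 + 13 = 46 equally sized datasets, with 45 used for training and 1 for testing in each iteration; the variable names are illustrative.

```python
# Assumption: 46 equally sized datasets in total (33 original + 13 new).
n_total = 33 + 13
n_test = 1  # one dataset held out per iteration

test_frac = n_test / n_total
train_frac = 1 - test_frac
print(f"leave-one-out split: {train_frac:.1%} / {test_frac:.1%}")

# For comparison, the number of datasets an 80/20 split would hold out.
n_test_80_20 = round(0.2 * n_total)
print(f"80/20 split: {n_total - n_test_80_20} train, {n_test_80_20} test")
```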

You could use scikit-learn's train_test_split, KFold, StratifiedShuffleSplit, etc. instead.
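
A minimal sketch of those three alternatives with scikit-learn is below. The data is synthetic and purely an assumption for illustration: 46 samples stand in for the 46 recordings, each with 5 made-up features and a binary label. The SVC is left at its defaults rather than tuned.

```python
import numpy as np
from sklearn.model_selection import (
    KFold,
    StratifiedShuffleSplit,
    cross_val_score,
    train_test_split,
)
from sklearn.svm import SVC

# Synthetic stand-in for 46 recordings (assumption: one feature vector
# and one binary label per recording).
rng = np.random.default_rng(0)
X = rng.normal(size=(46, 5))
y = np.array([0, 1] * 23)

# Single 80/20 hold-out split, stratified so both classes appear in each part.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)
holdout_acc = SVC().fit(X_train, y_train).score(X_test, y_test)

# 5-fold cross-validation: every sample is used for testing exactly once.
kfold = KFold(n_splits=5, shuffle=True, random_state=0)
kfold_scores = cross_val_score(SVC(), X, y, cv=kfold)

# Repeated random 80/20 splits that preserve the class proportions.
sss = StratifiedShuffleSplit(n_splits=10, test_size=0.2, random_state=0)
sss_scores = cross_val_score(SVC(), X, y, cv=sss)

print("hold-out accuracy:", holdout_acc)
print("5-fold mean accuracy:", kfold_scores.mean())
print("stratified shuffle mean accuracy:", sss_scores.mean())
```

Because the labels here are random, the printed accuracies carry no meaning; the point is only how each splitter is set up and passed to cross_val_score.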

Upvotes: 1
